Csv files must frequently be joined. It would be fantastic if we could connect CSV files using the power of SQL. This script accomplishes that. It executes SQL that is provided as an input parameter. In this example, I've made two csv files. This script may be expanded to handle multiple csv files, multiple column joins, group by, order by, and limit operations. Without this script, one may have to utilize vlookup, which might be expensive, time-consuming, and challenging to implement.
employees.csv
emp_id,emp_name,dept_id 1,Paul,1 2,John,1 3,Jane,1 4,Barb,2 5,Cody,2 6,Bill,2
departments.csv
dept_id,dept_name 1,Sales 2,Marketing
I wrote a python script called ExecuteSqlForCsv.py and I can join the employees.csv and departments.csv files
python3 ExecuteSqlForCsv.py "SELECT emp_name, dept_name FROM employees.csv, departments.csv WHERE employees.csv.dept_id = departments.csv.dept_id" The below output gets produced:
emp_name dept_name 0 Paul Sales 1 John Sales 2 Jane Sales 3 Barb Marketing 4 Cody Marketing 5 Bill Marketing