From CSV to API – Python 3 parsing

To get some practice with the Python language, we will build a simple application that does the following:

  • Launch the application with the path of a CSV file
  • Read the data from the CSV into a Python dictionary
  • Fill a JSON template file (an array of objects) with the data held in the dict collection
  • Send the JSON data to an API (mockbin)

Project set up

Before we dive into the application, let’s configure our environment.

Personally I’m using PyCharm Community Edition to write code, and I just need to link it to a version of Python installed on my computer (ex: 3.6…).

Once your IDE is ready, you should have a built-in virtual environment in which you can run the code base.

Pycharm venv

For those who are not using PyCharm or another full-featured IDE, you can also create your own virtual environment with the following command:

python3 -m venv .
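Then activate the environment and install the dependencies (listed later in requirements.txt). The activate path below assumes a Unix-like shell; on Windows it is Scripts\activate:

```shell
# activate the virtual environment created in the current directory
source bin/activate
# install the project dependencies
pip install -r requirements.txt
```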

Source files

The program depends on 3 files; using command-line arguments, you can switch those files according to your needs.

CSV file

Let’s look at a sample (data.csv):

person_id,person_name,trade_amount,trade_type,date_of_trade
1,John Doe,100.0,BUY,2021-01-01 10:00:00
2,Igor Temo,85.0,BUY,2021-01-02 11:00:05
3,Felicia Tore,1050.0,SELL,2021-01-03 12:00:15
4,Bob Lep,18.5,SELL,2021-01-04 14:00:00
5,Yvette Cord,258.0,SELL,2021-01-04 17:00:00
6,Fanny Koi,1008.20,BUY,2021-01-05 01:00:00
7,Donald Duck,468.50,BUY,2021-01-06 10:00:00

The job is to read from the file above, inject some of its data into a JSON template, and send the result to an API.

Json template

We use a template (template.json) to provide default values for the fields; we could also have kept those values as variables in the code.

[
    {
        "name": "Trade operation on broker",
        "type": "OPERATION_TO_REPLACE",
        "date": "DATE_OF_OPERATION",
        "amount": 0.0,
        "currency": "EUR",
        "sandbox": false,
        "broker": {
            "id": 55
        },
        "rights": [
            {
                "id": 10
            },
            {
                "id": 11
            }
        ],
        "trader": {
            "name": "NAME_TO_REPLACE",
            "id": "ID_TO_REPLACE"
        }
    },
    {
        "name": "Sandbox Trade operation",
        "type": "OPERATION_TO_REPLACE",
        "date": "DATE_OF_OPERATION",
        "amount": 0.0,
        "currency": "EUR",
        "sandbox": true,
        "broker": {
            "id": 55
        },
        "rights": [
            {
                "id": 1
            }
        ],
        "trader": {
            "name": "NAME_TO_REPLACE",
            "id": "ID_TO_REPLACE"
        }
    }
]

The template above will be read by our program, which replaces the placeholder fields (the ones written in upper case) with the values taken from each line of our CSV.
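Concretely, once json.load has parsed it, the template is just a list of dicts, so the substitution boils down to plain dictionary assignment. A standalone sketch with one hard-coded CSV row and a trimmed-down template:

```python
import json

# a cut-down version of template.json, inlined for the sketch
TEMPLATE = '''[{"type": "OPERATION_TO_REPLACE", "date": "DATE_OF_OPERATION",
                "amount": 0.0, "trader": {"name": "NAME_TO_REPLACE", "id": "ID_TO_REPLACE"}}]'''

# one row from data.csv, as the CSV reader will produce it
row = {'personId': '1', 'personName': 'John Doe', 'tradeAmount': '100.0',
       'tradeType': 'BUY', 'dateOfTrade': '2021-01-01 10:00:00'}

entries = json.loads(TEMPLATE)
for entry in entries:
    entry['type'] = row['tradeType']
    entry['date'] = row['dateOfTrade']
    entry['amount'] = row['tradeAmount']
    entry['trader']['name'] = row['personName']
    entry['trader']['id'] = row['personId']

print(entries[0]['type'])  # BUY
```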

Configuration parameters

Another input to our application is the configuration file (config.yml), which holds the parameters needed to reach the API to which we will send the updates.

Depending on your API, you may pass other parameters (ex: API key, other credentials, URL…)

api:
  url: http://mockbin.com/request
  user: admin
  password: my_pass
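A quick way to check that the file is valid is to parse it with pyyaml's safe_load; the result is a plain nested dict. A sketch with the config inlined as a string:

```python
import yaml

# same content as config.yml, inlined so the sketch is self-contained
CONFIG = """
api:
  url: http://mockbin.com/request
  user: admin
  password: my_pass
"""

config = yaml.safe_load(CONFIG)
print(config['api']['url'])  # http://mockbin.com/request
```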

Additional parameters could also be included; we will see in another article how to plug in a mailing feature.

Libraries to include

Our program imports a handful of libraries, and since we are using Python 3, most of them ship with the standard library.

In the requirements.txt file, we only need to add the following:

requests
pyyaml

The former is for API calls and the latter for YAML parsing.

Besides those, we also use argparse (for argument parsing) and logging within our main application (main.py).

Project code base

Here we are!

I’ve decided not to implement unit tests, but rather to give each Python file a way to run independently of the others.

I invite you to refactor the code using unit tests.

Also, there is no object-oriented implementation; the API calls and the YAML, JSON and CSV parsing could have been refactored using OOP.

Json Template parser

Let’s start with the template reader (template_reader.py):

import json


def json_reader(path):
    # load the whole template file into Python objects (a list of dicts here)
    with open(path) as json_file:
        return json.load(json_file)


def to_json(data):
    # serialise each entry back to a pretty-printed JSON string
    multiple = []
    for d in data:
        multiple.append(json.dumps(d, indent=4))
    return multiple


if __name__ == '__main__':
    all_data = json_reader('template.json')
    for data in all_data:
        print(data['name'])
        print(data['type'])
        print(data['amount'])
        print(data['broker']['id'])
        print(data['rights'][0]['id'])
        print(data['trader']['name'])

    all_data = to_json(all_data)
    for a in all_data:
        print(a)

No big deal here: we simply used the json Python library and its load & dumps functions.

Only 2 functions, plus the usual if __name__ == '__main__': entry point, which you can try with the template provided.

 

Python Main entry - Json Template reader

CSV reader

Nothing fancy here either: we stick with the csv Python library and include a main entry point to test the functionality by echoing the content of the resulting list of dictionaries.

import csv


# person_id, person_name, trade_amount, trade_type, date_of_trade
def read_csv(path='data.csv'):
    # newline='' is recommended by the csv docs when opening CSV files
    with open(path, newline='') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        line_count = 0
        all_data = []
        for row in csv_reader:
            # we do skip the header line
            if line_count > 0:
                all_data.append({
                    'personId': row[0],
                    'personName': row[1],
                    'tradeAmount': row[2],
                    'tradeType': row[3],
                    'dateOfTrade': row[4]
                })
            line_count += 1
        return all_data


if __name__ == '__main__':
    data = read_csv('data.csv')
    for trade in data:
        print(f'\tPerson id : { trade.get("personId") }, Person name : { trade.get("personName") }, Trade amount : { trade.get("tradeAmount") }, Trade type : { trade.get("tradeType") }, Trade date : { trade.get("dateOfTrade") }')

Csv Python parser
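As a side note, the standard csv.DictReader handles the header row for you. A sketch of an alternative reader (the keys then match the CSV header names such as person_id, not the camelCase ones used above):

```python
import csv


def read_csv_dict(path='data.csv'):
    # DictReader uses the first row as field names, so no manual
    # header skipping or column-index juggling is needed
    with open(path, newline='') as csv_file:
        return [dict(row) for row in csv.DictReader(csv_file)]
```

With the sample file, each row comes out as {'person_id': '1', 'person_name': 'John Doe', …}.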

Api post request

Here is another important script, dedicated to API calls (api.py).

For the moment, only POST calls are available, as those are the only kind our application needs.

You may learn more about API calls following this resource.

Or for a more advanced view of the Requests HTTP library : this one.

And the source code here.

The script’s main entry provides a test against the MockBin API.

Do not hesitate to alter the content of the script for your own use cases!

import requests
import json


def post(url, username, password, json_body):
    headers = {'content-type': 'application/json'}
    print('_request [POST]', url, username)
    response = requests.post(url, data=json.dumps(json_body), auth=(username, password), headers=headers, verify=True)
    print("  => ", response.status_code)
    return response


if __name__ == '__main__':
    entry = {
        'name': 'Bob',
        'date': '2021-01-07 10:00:00'
    }
    res = post('http://mockbin.com/request', '', '', entry)

    print(json.dumps(res.json(), indent=4, sort_keys=True))

Mockbin api post request
Mockbin API Post request with headers

We tried to keep things simple here. We only have a single function for posting, and it features:

  • authentication
  • JSON content type in the headers
  • JSON formatting of our payload (coming from a dict)
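To see exactly what those three things produce on the wire, you can ask requests to build the request without sending it, via its Request/prepare API (a sketch; the URL and credentials are just placeholders):

```python
import json
import requests


def prepare_post(url, username, password, json_body):
    # build (but do not send) the POST request so we can inspect it
    req = requests.Request(
        'POST', url,
        data=json.dumps(json_body),
        auth=(username, password),
        headers={'content-type': 'application/json'},
    )
    return req.prepare()


prepared = prepare_post('http://mockbin.com/request', 'admin', 'my_pass', {'name': 'Bob'})
print(prepared.headers['content-type'])  # application/json
print(prepared.body)
```

Note that prepare() is when basic auth is turned into the Authorization header.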

Besides the above, you likely noticed the following parameter:

verify=True

You can set it to False when testing against an HTTPS endpoint whose certificate cannot be validated (a self-signed certificate, for instance) to skip certificate verification.

Well, now it is time to plug everything together!

The main program

Here is what our main.py contains:

 

import argparse
import json

import template_reader
import csv_reader
import yaml
import api
import logging

logging.basicConfig(format='[%(asctime)s] %(levelname)s: %(message)s', datefmt='%m/%d/%Y %I:%M:%S')
logger = logging.getLogger('main')
logger.setLevel(logging.DEBUG)


import copy


def merge(template_data, csv_data):
    # Build one payload per CSV row by working on deep copies; without the
    # copy, every row would overwrite the same template dictionaries and
    # only the last CSV line would survive.
    merged = []
    for cd in csv_data:
        for td in template_data:
            entry = copy.deepcopy(td)
            entry['type'] = cd.get('tradeType')
            entry['date'] = cd.get('dateOfTrade')
            entry['amount'] = cd.get('tradeAmount')
            entry['trader']['id'] = cd.get('personId')
            entry['trader']['name'] = cd.get('personName')
            merged.append(entry)

    return merged


def get_config(args):
    with open(args.config, 'r') as stream:
        try:
            return yaml.safe_load(stream)
        except yaml.YAMLError as exc:
            logger.error(exc)


def parse_cli():
    parser = argparse.ArgumentParser(description='Send CSV over to API')
    parser.add_argument('--csv', help='csv data file path', required=False, default='data.csv')
    parser.add_argument('--json', help='data template file path', required=False, default='template.json')
    parser.add_argument('--config', help='Config file path', required=False, default='config.yml')
    return parser.parse_args()


if __name__ == '__main__':

    cli_args = parse_cli()
    config = get_config(cli_args)

    data = csv_reader.read_csv(cli_args.csv)
    template = template_reader.json_reader(cli_args.json)

    multi_json_to_post = merge(template, data)

    # the URL and credentials now come from config.yml instead of being hard-coded
    for json_to_post in multi_json_to_post:
        res = api.post(config['api']['url'], config['api']['user'], config['api']['password'], json_to_post)
        logger.info('Results %s', json.dumps(res.json(), indent=4, sort_keys=True))

Bulk post to API from CSV file

If you set up your environment correctly, you should end up with multiple calls to your API.

However, beware: we assumed that the API does not handle bulk requests, which is why you end up with one call per CSV entry.

If your API does handle them (and it should!), you don’t need the for loop to split the calls – you should be able to send the whole array of objects at once.

Instead of prints, I have shipped a logger, for the fun of using it.
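Since api.post already serialises whatever it receives with json.dumps, bulk sending is just a matter of passing the merged list in a single call. A tiny standalone sketch of the payload shape (sample dicts standing in for the real merge output):

```python
import json

# merge() returns a list of dicts; json.dumps serialises the whole list
# into one JSON array, so a bulk-capable API needs a single POST
multi_json_to_post = [
    {'type': 'BUY', 'amount': '100.0'},
    {'type': 'SELL', 'amount': '1050.0'},
]
body = json.dumps(multi_json_to_post)
print(body)
```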


I hope that you enjoyed this tutorial. If you have any concerns, hit me up!
