Creating Singer Target to Send Data to Web Endpoint
Please note: If you are new to Singer, you may want to check out my last post, Creating Singer Tap to Capture Ebay Completed Items. It provides high-level background on the specification and how taps and targets work together.
The Challenge
Last week I created a walkthrough, Creating Singer Tap to Capture Ebay Completed Items. While it's great to capture data, it isn't very useful unless the data is persisted to a target destination.
There are useful targets posted on singer.io and on Meltano for writing to a nice variety of standard destinations (databases, cloud data warehouses, S3, CSV, JSON, etc.). However, my site has an API, and I could not find a target that sends data to a web endpoint.
I want to be able to pipe data from a Singer tap to my own API endpoints.
Creating target-web_endpoint
Code for this Singer target is posted on GitHub (jaygrossman/target-web_endpoint).
Project Scope
The goal is to create a Singer target that takes data piped from a Singer tap and sends it to a web endpoint (via an HTTP GET or HTTP POST). Below is a visual illustration:
This target must be able to support the following requirements:
- Sending record data (piped from a tap) to a URL endpoint via HTTP GET or HTTP POST.
- Configuration of basic auth credentials and HTTP headers for the HTTP POST method.
- Configuration to map source data field names to target system's data field names.
- Configuration to specify additional properties (with static values) to send to endpoint.
- Configuration to specify VERY BASIC filter rules based on the record values.
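As a rough illustration of the first two requirements, the sketch below builds (but does not send) a GET or POST request from a config dict and a record. The helper name `build_request` is hypothetical, and it uses the standard library's `urllib` so it runs anywhere; the project itself installs `requests`, so the real code likely looks different.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_request(config, record):
    """Build a GET or POST request for one record (hypothetical helper)."""
    method = config.get("method", "GET").upper()
    query = urlencode(record)
    if method == "GET":
        # GET: the record travels in the query string.
        sep = "&" if "?" in config["url"] else "?"
        return Request(config["url"] + sep + query, method="GET")
    # POST: the record travels form-encoded in the request body.
    headers = dict(config.get("post_headers", {}))
    if config.get("username"):
        # Basic auth is only applied to POST, as the requirements state.
        token = f'{config["username"]}:{config.get("password", "")}'
        headers["Authorization"] = "Basic " + base64.b64encode(token.encode()).decode()
    return Request(config["url"], data=query.encode(), headers=headers, method="POST")


req = build_request({"url": "https://github.com/search"}, {"q": "tap-ebaycompleted"})
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send it; that step is left out here so the sketch has no network side effects.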
Helpful links to get background on Developing Singer Targets
- Singer provides getting started docs on creating targets.
- Meltano's Slack has a dedicated channel, #singer-target-development, for help developing targets.
Setting up development for Singer Target
To develop a target, we need to install the Singer library:
pip install singer-python
Next we'll install cookiecutter and download the target template to give us a starting point:
pip install cookiecutter
cookiecutter https://github.com/singer-io/singer-target-template.git
project_name [e.g. 'tap-facebook']: target-web_endpoint
package_name [target_web_endpoint]: target_web_endpoint
Configuration file for the Target
A template is provided at config.json.example. Copy it to config.json in the repo root and update the following values:
{
  "method": "POST",
  "url": "https://api.some_web_site.com/listings/",
  "username": "my_username",
  "password": "my_password",
  "post_headers": {
    "Content-Type": "application/x-www-form-urlencoded"
  },
  "property_mapping": {
    "field1": { "target_field_name": "target_field_1" },
    "field2": { "target_field_name": "target_field_2" },
    "field3": { "target_field_name": "target_field_3" },
    "field4": { "target_field_name": "field_4" },
    "field5": { "target_field_name": "field_5" }
  },
  "additional_properties": {
    "system_id": 12,
    "special_key": "0cf18148-1687-11ee-be56-0242ac120002"
  },
  "filter_rules": {
    "field1": { "type": "equals", "value": true },
    "field2": { "type": "not_equals", "value": "123" },
    "field3": { "type": "is_empty", "value": false }
  }
}
Variable | Description
method | Method for calling the URL (GET or POST). Default is GET.
url | Endpoint URL. REQUIRED.
username | User name for basic auth (only for POST).
password | Password for basic auth (only for POST).
post_headers | Dict of headers to pass (only for POST).
property_mapping | Defines the properties received from the tap that are sent to the endpoint. You can rename each property for the target.
additional_properties | Defines additional properties with hard-coded values that are sent to the endpoint.
filter_rules | Rules that restrict which records are sent; a record is included only when it matches all criteria.
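To make `property_mapping` and `additional_properties` concrete, here is a minimal sketch of how a record could be turned into the payload sent to the endpoint. The helper name `build_payload` is hypothetical, and two behaviors are assumptions rather than something the config description confirms: fields missing from the mapping are dropped, and static values win when names clash.

```python
def build_payload(record, config):
    """Rename mapped fields and append static properties (hypothetical helper)."""
    payload = {}
    for source_name, rule in config.get("property_mapping", {}).items():
        if source_name in record:
            # Fall back to the source name when no target name is configured.
            payload[rule.get("target_field_name", source_name)] = record[source_name]
    # Static values are added last, so they win on a name clash (an assumption).
    payload.update(config.get("additional_properties", {}))
    return payload


config = {
    "property_mapping": {"field1": {"target_field_name": "target_field_1"}},
    "additional_properties": {"system_id": 12},
}
print(build_payload({"field1": "abc", "unmapped": "dropped"}, config))
# → {'target_field_1': 'abc', 'system_id': 12}
```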
Notes about filter_rules:
1. A record is sent to the endpoint only when it satisfies every configured rule.
2. You can define only one rule per field.
3. There are 5 supported types of rules:
Rule Type | Description
equals | The field's value must equal the configured value.
not_equals | The field's value must not equal the configured value.
contains | The field's value must contain the configured value.
not_contains | The field's value must not contain the configured value.
is_empty | If true, the field's value must not be empty. If false, the field's value must be empty.
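The rule types above could be evaluated along these lines. This is only a sketch: `passes_filters` and `_is_empty` are hypothetical names, "empty" is assumed to mean `None` or a blank string, and `contains` is assumed to be a substring test on the string form of the value. The `is_empty` branch follows the table exactly as written.

```python
def _is_empty(value):
    # Assumption: None and blank strings count as empty.
    return value is None or str(value).strip() == ""


def passes_filters(record, filter_rules):
    """Return True only if the record satisfies every configured rule."""
    for field, rule in filter_rules.items():
        value = record.get(field)
        expected = rule.get("value")
        kind = rule["type"]
        if kind == "equals":
            ok = value == expected
        elif kind == "not_equals":
            ok = value != expected
        elif kind == "contains":
            ok = str(expected) in str(value)
        elif kind == "not_contains":
            ok = str(expected) not in str(value)
        elif kind == "is_empty":
            # Per the table: true means "must NOT be empty", false means "must be empty".
            ok = _is_empty(value) != bool(expected)
        else:
            raise ValueError(f"unsupported rule type: {kind}")
        if not ok:
            return False
    return True


rules = {
    "field1": {"type": "equals", "value": True},
    "field2": {"type": "not_equals", "value": "123"},
}
print(passes_filters({"field1": True, "field2": "456"}, rules))
print(passes_filters({"field1": True, "field2": "123"}, rules))
```

Because the rules are keyed by field name, a dict naturally enforces the one-rule-per-field limit from note 2.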
Setting up to run the Target
Let's create a virtual environment to run our target within:
cd target-web_post
python3 -m venv ~/.virtualenvs/target-web_endpoint
source ~/.virtualenvs/target-web_endpoint/bin/activate
git clone git@github.com:jaygrossman/target-web_endpoint.git
cd target-web_endpoint
pip install requests
pip install -e .
deactivate
We can pipe the output of a tap to our target with the following command (after the | symbol):
run_your_tap | ~/.virtualenvs/target-web_endpoint/bin/target-web_endpoint
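On the receiving side of that pipe, a target reads one JSON message per line from stdin. The Singer specification defines SCHEMA, RECORD, and STATE messages; the sketch below dispatches them in the simplest possible way. The function name `process_lines` and the sample field name `q` are made up for illustration, and an in-memory buffer stands in for `sys.stdin`.

```python
import io
import json


def process_lines(lines, handle_record):
    """Dispatch Singer messages; return the last STATE value seen."""
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message["type"] == "RECORD":
            handle_record(message["stream"], message["record"])
        elif message["type"] == "STATE":
            state = message["value"]
        # SCHEMA messages are ignored in this sketch.
    return state


# Stand-in for what a tap would write to the pipe.
sample = io.StringIO(
    '{"type": "SCHEMA", "stream": "searches", "schema": {}, "key_properties": []}\n'
    '{"type": "RECORD", "stream": "searches", "record": {"q": "tap-ebaycompleted"}}\n'
    '{"type": "STATE", "value": {"bookmarks": {"searches": {}}}}\n'
)
seen = []
state = process_lines(sample, lambda stream, record: seen.append((stream, record)))
print(seen, state)
```

In a real target, `sys.stdin` would be passed in place of `sample`, and `handle_record` would filter, map, and send each record.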
EXAMPLE: Running Tap-Csv + Target-web_endpoint
I created a sample_data folder in the project's github repo that includes:
- sample_data.csv file contains a github keyword search
- tap-csv.config.json file contains config for the tap-csv
- target-web_endpoint.config.json file contains config for the target-web_endpoint
Running this pipeline searches GitHub (https://github.com/search) via an HTTP GET request with the keywords supplied in sample_data.csv.
Install tap-csv:
python3 -m venv ~/.virtualenvs/tap-csv
source ~/.virtualenvs/tap-csv/bin/activate
pip install git+https://github.com/MeltanoLabs/tap-csv.git
deactivate
We can run tap-csv piped to our target-web_endpoint with the following command:
~/.virtualenvs/tap-csv/bin/tap-csv --config sample_data/tap-csv.config.json | ~/.virtualenvs/target-web_endpoint/bin/target-web_endpoint --config sample_data/target-web_endpoint.config.json
The command outputs the following:
2023-06-29 23:17:10,957 | INFO | tap-csv | Beginning full_table sync of 'searches'...
2023-06-29 23:17:10,957 | INFO | tap-csv | Tap has custom mapper. Using 1 provided map(s).
2023-06-29 23:17:10,957 | INFO | singer_sdk.metrics | METRIC: {"type": "timer", "metric": "sync_duration", "value": 0.000225067138671875, "tags": {"stream": "searches", "context": {}, "status": "succeeded"}}
2023-06-29 23:17:10,957 | INFO | singer_sdk.metrics | METRIC: {"type": "counter", "metric": "record_count", "value": 1, "tags": {"stream": "searches", "context": {}}}
url: https://github.com/search?q=tap-ebaycompleted, response: <Response [200]>
{"bookmarks": {"searches": {}}}
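The final line of that output is the state: Singer targets conventionally write the last state they received to stdout once the records have been handled. A minimal sketch of that step, assuming a helper named `emit_state` (the name is illustrative and not taken from this project's code):

```python
import io
import json
import sys


def emit_state(state, out=sys.stdout):
    """Write the state as a single JSON line, if there is one."""
    if state is not None:
        out.write(json.dumps(state) + "\n")
        out.flush()


# Capture the output in a buffer to show what would reach stdout.
buffer = io.StringIO()
emit_state({"bookmarks": {"searches": {}}}, buffer)
print(buffer.getvalue(), end="")
# → {"bookmarks": {"searches": {}}}
```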