A Node.js application to parse YAML files
These past two weeks I have been interning at Infibeam — a company that manages all of India’s public procurement using their online Government eMarketplace (GeM) platform — as a Technology Analyst. The company’s Data Warehouse in Bangalore sends all its data as YAML files, but the Analytics team needs this data in a database for analysis. So I was given the job of automating the process of transferring the YAML data into an SQL database. To do so, I am building a Node.js service that first converts YAML to JSON and then inserts it into a DB.
Here, I will be discussing my use of JS-YAML to accomplish this task.
JS-YAML - YAML 1.2 parser
js-yaml is one of the most widely used Node.js libraries for parsing YAML into JavaScript objects, which can then be serialized to JSON.
To install js-yaml:
    $ npm install js-yaml
Getting started is very easy
    const yaml = require("js-yaml");
The code above reads your YAML file and then prints it out as JSON. This alone is enough to parse most YAML files. But what if the YAML has custom tags in it? Things get a little tougher then, because js-yaml rejects tags it does not know, so we have to create a schema that describes them. The online literature on this topic is sparse, which is why I wanted to share this with you all. So here is a simple fix.
Parsing Custom YAML tags
    let json_data = {
What this does is create a schema for your CustomType and then pass that schema on to the safeLoad function, so that when the file is parsed, the customTypeName is recognized as a custom YAML tag. This will fix most unknown tag errors that you might be getting when parsing YAML files.
Summary
So that was it: a simple fix to a very annoying problem you might come across when parsing YAML files. The YAML data I am working with is enormous, so I have had to create a host of schemas to parse it. I wish there were a way of automating the schema generation process as well.