Project Overview
- I started this project during my internship at IBM Almaden Research Center and continued working on it after returning to grad school to finish my degree
- The purpose of this project was to generate domain-specific, semantically labeled data for ESSP to use in the finance and compliance (legal contracts) domains, addressing the needs of some of our biggest clients
- This involved working with domain experts (investors, accountants, and paralegals) to help with the labeling
- I personally handled hiring, budgeting for the labeling, and explaining the labeling task to the experts, and ran random batch reviews of the labeled data with them
- Within a week, I developed a Slack-like application using ElectronJS, which was new to me, to make it easier for the labelers to select tokens and label them with button clicks
- The final product, after quality checks by me and my colleagues, was open-sourced as FinProp and ConProp as part of IBM's Data Exchange Programme
- In addition to providing domain knowledge to the Watson OneNLP library, this work gives anyone in the tech community access to domain-specific NLP and Computational Linguistics capabilities
Skills
Computational Linguistics and the years of effort behind NLP systems
This is when I started learning in depth about the constituents of our everyday language and how they come together to provide such rich meaning. It also brought into perspective how intelligent and knowledgeable our NLP systems must be to keep up with the beautifully complex human language. I learned about proposition banks, VerbNet, FrameNet, CoNLL tasks, verb roles, the arguments surrounding them, and much more.
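To give a concrete feel for what proposition-bank-style labeling looks like, here is a minimal Python sketch. The sentence, the class names (`Argument`, `Proposition`), and the role assignments are made up for illustration; they are not taken from FinProp or ConProp and do not reflect the actual format of those datasets.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Argument:
    role: str   # role label, e.g. "ARG0" (agent-like) or "ARGM-TMP" (temporal modifier)
    start: int  # index of the first token in the span (inclusive)
    end: int    # index one past the last token in the span (exclusive)


@dataclass
class Proposition:
    tokens: List[str]
    predicate_index: int       # position of the verb being annotated
    arguments: List[Argument]  # labeled spans attached to that verb

    def labeled_spans(self) -> Dict[str, str]:
        """Map each role label to the text of its token span."""
        return {
            arg.role: " ".join(self.tokens[arg.start:arg.end])
            for arg in self.arguments
        }


# Hypothetical example sentence, not drawn from any real dataset.
tokens = "The bank acquired the startup last year".split()
prop = Proposition(
    tokens=tokens,
    predicate_index=2,  # "acquired"
    arguments=[
        Argument("ARG0", 0, 2),      # who acquired
        Argument("ARG1", 3, 5),      # what was acquired
        Argument("ARGM-TMP", 5, 7),  # when
    ],
)

print(prop.tokens[prop.predicate_index], prop.labeled_spans())
```

Each labeler decision boils down to choosing a span of tokens and attaching a role to it relative to a verb, which is why a tool built around token selection and button clicks fits the task.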
Working with domain experts
This was my first time working with domain experts and acting as the bridge between their knowledge and its translation into an NLP task dataset. It was a very satisfying project that allowed me, along with others, to contribute one more small part to the progress of the NLP and ML community as a whole.
Hiring, budgeting, planning and supervising
This was the first time I was given the responsibility of hiring domain experts within a fixed budget and overseeing the process end to end, including training the labelers and performing quality reviews. With occasional advice from seniors, I was able to execute the whole process within 4 months, during two of which I was also in grad school. This process helped me develop many important soft skills.