Parismita Das \(๑•́o•̀๑)/

Experience!

My Story as a Google Summer of Code 2018 participant

I am Parismita Das, and I study engineering physics at IIT Guwahati. I was selected as a Google Summer of Code (GSoC) student in 2018 by the R Foundation organisation, where I worked on the project Maximum Margin Interval Trees (MMIT). Google Summer of Code was not just work experience. It also taught me various life lessons: rising up from failures, the joy of working in collaboration, commitment towards work, handling people's expectations, the value of maintaining a schedule, the competition we need to face in this world, where I stand and how to move forward, and many more.

Getting Started

In October 2016, when I was a sophomore, I came to know about GSoC from my friends and colleagues. Being a sophomore, I had no idea what I wanted for my future, whether I wanted to study physics or become a software developer or a robotics researcher. But I was sure of one thing: I wanted to try as many fields as I could before committing to one. Hence I decided that I would try to get into the GSoC program the next summer.

Initially, I assumed that we just need good coding skills, apply to some orgs, make some commits, and that's all. So I started looking through the organisations to decide which ones to apply to, and I selected CloudCV and OpenCV to begin with. As it was my first time with open source, I had no idea where to start. So I just went to their GitHub repos and started reading the contributors' blogs and interacting with people from the orgs. I started solving easy issues and bugs and soon got engaged with larger tasks. During that time, I had also started working on machine learning projects and wanted to learn the subject in depth and gain some experience building an ML product from scratch. By the end of one month I realised that if I was putting in so much effort, I should put it into the subject I love the most.

Hence, I moved on to searching for organisations that work on machine learning and whose projects could be done with my level of knowledge. I finally found that R has some cool projects that need to be built from scratch, and that we get to choose the field we want to work in from a sea of projects.

First Failure

In December 2016, I picked up the project on Constrained Hierarchical Clustering. Organisations like R, CERN and IAI have a unique way of selecting students. They give you a small test or assignment and select the student based on the results. So, I started working on the Constrained HAC tests in January 2017, and I was able to solve all the questions by the end of February. In March, it was time to write proposals. I kept in good contact with the mentors and edited the proposal following the tips in a lot of blogs by successful GSoC students. But I was ranked second on the selection list. I believe the reason I was not selected was that my documentation was not good enough. There was a very fine difference between the student who was selected and me, and that was the clear timeline in his proposal and his awesome documentation skills. He had made a whole website for the test solutions, while I had made a text file.

And that was the end of the first phase. I started a little bit of preparation and made plans for the next year's GSoC. From GSoC 2017, I learned that I must improve my knowledge of data structures and algorithms, and improve my documentation skills.

Second Attempt

I started preparing again in December 2017. This time I was very selective: I wanted to work specifically on a machine learning or particle physics related problem statement. I went through the projects listed by R for GSoC 2018 and found a project that exactly matched my interest. This time I started a month early and tried to complete the tests by January 2018, so that I would get enough time for documentation. And I was very much on schedule. I completed the tests, along with documentation on GitHub Pages, by the beginning of January, and by mid-February I had started working on the proposal and learning about the MMIT Python package.

Happily, I got selected. I was so happy because it proved that I had improved a bit on my coding skills as well as my documentation skills, and also because I was selected, of course :P

Community Bonding

This was the time when we had to learn about our work, make a realistic timeline, study the previous packages, and so on. I mostly spent it reading the C++ code and the dynamic programming paper, and understanding what the package does.

First Evaluation

I started working on the first part of the project, which consisted of building the MMIT tree and the pruning algorithm along with cross-validation. By the first evaluation I had completed the MMIT tree and the pruning algorithm, but the cross-validation was pending. By the end of the first evaluation, I started realising that the schedule I had made was a little unrealistic and that I should have allotted more time to the pruning and cross-validation code, as it was not as easy as it looked.

Second Evaluation

By the second evaluation, the random forest was supposed to be completed, but I was still struggling with the cross-validation code. It was more complicated than expected, but eventually it got completed by the end of the second evaluation, and I picked up speed and was back on the original track.

Final Evaluation

I was supposed to complete the first version of the package by the final evaluation, and thankfully we were done with debugging the code and writing all the parts, such as the random forest and its cross-validation. AdaBoost was still remaining, but we focused on completing a bug-free package with documentation. So until the final evaluation, we focused mostly on documentation and on thoroughly checking the package.

GSoC Ends, What Now?

Even after completing the first version of the package and the end of GSoC, I am still working on the same project, on the AdaBoost part and on improving the efficiency of the package for version 2. GSoC gave me a platform to collaborate with such a big organisation and taught me about open source, organisation culture, teamwork and more. It gave me the opportunity to interact with great people in the technical field who are working on awesome problem statements. I believe it's a great program that helps students like me gain experience in the field where they want to build their career.

That's it. That's how GSoC helped me find a new world to work in, and a new journey has started, with new challenges and opportunities.

Here's the link to the package tutorial: https://aldro61.github.io/mmit/tutorials/R/