It’s been three months since the GSoC started and it’s come to the end. First of all, I greatly appreciate my mentors (jmaher and armenzg) . I have learned tons of things from them and it would have been impossible to get here without their help. I also must say thank you to dustin, martiansideofthemoon and anyone who has helped me during these past three months. I really enjoy working with the people in the A-team and the Mozilla community. I believe I will keep being involved as long as possible!
This GSoC project involved several parts- (1) refactor SETA’s code and make it robust (2) make SETA work on Heroku (3) Reduce database size and use data from other systems (4) Integrate with the Gecko decision task
For code refactoring, we’ve fixed many existing pep8 errors and added flask8 support in #PR87 . We’ve also verified and removed some redundant packages from requirement.txt because we no longer needed them. Furthermore, in order to make the codebase more readable and easy to test, we started using sqlalchemy to do the database operations instead of embedded SQL statements and added some tests for it in #PR109. Sqlalchemy turned out to be a fantastic choice for us because it does not only makes tests easier to write, but it also give us faster database querying and storing. In #PR84, we use fetch_json instead of the pushlog endpoint which make things cleaner. As a bonus point, we fixed insert key errors in #PR92 and added a failure column in #PR82. We’ve also made SETA display job results appropriately in #PR90. In #PR91, we made linux64 debug jobs be visible on SETA. At the moment it’s not only useful on linux64 debug jobs but it also works for other job data that comes from Taskcluster.
The second part of this project is make SETA works on heroku, and all the related PRs are included in MikeLing/ouija-rewrite branch. First of all, we need migrate our database from mysql to postgresql(it’s default database on heroku) and things become much more easier after we switch to the sqlalchemy [PR]. Secondly, we need to make updatedb.py and failures.py(we use these two scripts to update our database and store our analysis results) running automatically[PR]. Then, we add a stage server for SETA, it could do pre-deployment validation as what has been done in treeherder and avoid breaking something accidently in the target server. I must say thank you to armenzg again because he gives me a lot of helps on this and helps me fork repo to the stage server. Anyway, we couldn’t works well on the heroku without armen’s help 🙂
The final piece of this project is to integrate with the Gecko decision task. On the server side, we separated Taskcluster jobs from Buildbot jobs and started listing all low value jobs to ensure that we run brand new jobs by default in #PR101 and #PR113. TaskCluster can query the low value job list from the server side and can create decision task based on it (You can check it out on http://seta-dev.herokuapp.com/data/setadetails/?taskcluster=1)
We also found a way to identify new jobs from runnablejobs.json and remove expired jobs from preseed.json in #RP112. In bug 1287018, we try to figure out how to make TaskCluster use data from SETA to schedule tasks and I committed several patches about it. The Gecko decision task is the vital part for our task scheduling and a lot of things need to be discussed in there. This is now a stretch goal for this project and I will keep working on it after GSoC work period :).