Automated Data Analysis “RakuDA” Team RakuDA
Distributed data processing platform SE project
Sakai：The research areas of the Distributed data processing platform SE project include deep learning and machine learning. Our team is in charge of machine learning. We work on automated data preprocessing, algorithm selection, and parameter tuning. In addition, our project has various research themes, such as a deep learning platform for cost reduction, a large-scale sensor data accumulation system for the next-generation mobility society, including connected cars, and a spatiotemporal data management platform technology for high-speed search by latitude, longitude, and time.
Shioda：We have applied machine learning for NTT affiliate companies to create new business through data analysis and its platform. In these activities, our team noticed that data scientists repeat similar processes, and we thought that automating such processes could reduce their workload. We started the research in early 2015. Around that time, demand for data analysis was rising under the keyword “big data”. We realized that this trend called for us to focus on automated data analysis. We continued the research and released the product in 2018.
Oikawa：We started the initial development in FY 2017, after spending about two years on research, prototyping, and proofs of concept to confirm the effectiveness of automated data analysis. The key idea of RakuDA is exploration and exploitation of various data preprocessing techniques, machine learning algorithms, and their hyperparameter settings, and RakuDA executes the search procedure in parallel on multiple workers. The development was large in scale, and we struggled to stabilize quality and performance.
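The search idea described above can be illustrated with a minimal sketch. It is not RakuDA's implementation: the search space, the toy `evaluate` score, the epsilon-greedy rule, and all names below are assumptions made for illustration. Each round builds a batch of candidate pipelines, either sampled at random (exploration) or obtained by changing one setting of the best pipeline found so far (exploitation), and evaluates the batch in parallel on a pool of workers.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search space; the option names are illustrative only.
SEARCH_SPACE = {
    "preprocess": ["none", "standardize", "minmax"],
    "algorithm": ["linear", "tree", "knn"],
    "strength": [0.01, 0.1, 1.0, 10.0],
}


def evaluate(config):
    """Stand-in for training and validating one pipeline.

    A real system would fit a model here; this toy score just adds
    fixed per-option values so the sketch runs without any data.
    """
    score = {"none": 0.0, "standardize": 0.3, "minmax": 0.2}[config["preprocess"]]
    score += {"linear": 0.1, "tree": 0.4, "knn": 0.2}[config["algorithm"]]
    score += {0.01: 0.0, 0.1: 0.2, 1.0: 0.3, 10.0: 0.1}[config["strength"]]
    return score


def random_config(rng):
    """Exploration: sample every setting uniformly at random."""
    return {key: rng.choice(options) for key, options in SEARCH_SPACE.items()}


def mutate(config, rng):
    """Exploitation: resample one setting of a known-good configuration."""
    new_config = dict(config)
    key = rng.choice(list(SEARCH_SPACE))
    new_config[key] = rng.choice(SEARCH_SPACE[key])
    return new_config


def search(rounds=10, workers=4, epsilon=0.3, seed=0):
    """Epsilon-greedy search; each round evaluates one batch in parallel."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(rounds):
            batch = []
            for _ in range(workers):
                if best_config is None or rng.random() < epsilon:
                    batch.append(random_config(rng))
                else:
                    batch.append(mutate(best_config, rng))
            # pool.map returns scores in the same order as the batch.
            scores = list(pool.map(evaluate, batch))
            for config, score in zip(batch, scores):
                if score > best_score:
                    best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    config, score = search()
    print(config, score)
```

Threads are used here only to keep the sketch short. When the evaluation step is CPU-bound model training, a process pool or separate worker machines would be the more natural choice.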
Sakai：Efforts to foster “citizen data scientists” are widespread in our company. At present, constructing a model with machine learning or deep learning is possible only for people with sufficient knowledge. Therefore, we made easy-to-use interfaces for those who usually use Microsoft Excel for statistical processing, and users can now build models somewhat more easily. The current version of “RakuDA” is already considerably simplified, but a certain level of expert knowledge is still necessary. So we are going to make RakuDA simpler to increase the number of users. At the same time, to make our product “deeper”, we are also tackling research on automated deep learning.
Ishii：I worked for an NTT affiliate company and was in a position to use this product. Before the development of “RakuDA” was completed, the research team explained the outline of the product to me. I thought the product could be used immediately as a data analysis tool. In mid-2018, I joined this team, and I would like to contribute by establishing a strategy to make “RakuDA” more useful to business and customers.
Sakai：My first priority is deciding the direction of my team and preparing resources so that the members can work easily. I entrust the technical work to the other team members because their machine learning knowledge is superior to mine.
Ishii：I am mainly in charge of making strategies for using “RakuDA” in business and drawing up a roadmap for future R&D in the data analysis field. In addition, I manage the “RakuDA” development project to add new functionality and improve performance.
Oikawa：I am in charge of the technical aspects of development and the overall architecture of “RakuDA”. To promote our product, I sometimes visit NTT affiliate companies and help users analyze their data with it.
Shioda：My mission is designing and developing the algorithms of “RakuDA”, i.e., the data preprocessing, machine learning, and hyperparameter optimization algorithms. I want to increase the number of RakuDA users and make the product deeper; I am now tackling research on automated neural network architecture design, e.g., neural architecture search.
Maki：I am in charge of research on expanding the applications of automated data analysis. For that purpose, we visit NTT group companies and identify issues by applying “RakuDA” to their business data. In addition, as a data scientist, I provide data analysis insights to NTT group companies to improve their business performance.
Shioda：I find it a rewarding challenge to develop one product as a whole team in the automated data analysis domain. Each member has highly specialized skills, and I enjoy working together, with everyone performing at their best toward achieving the goal.
Ishii：Our team has talented members who make up for the gaps in my knowledge. As my team broadens my world and gives me various insights, I feel that this team helps me grow and that the quality of our products is improving.
Sakai：I feel our team is very well balanced because it consists of members with different specialized fields. I hope I can bring their abilities together well to deliver the best performance.
Ishii：I believe providing easy analysis tools will be a great advantage in solving the data scientist shortage. A world in which everyone can analyze their own data by themselves on site would allow us to shorten the lead time from data analysis to business utilization. I also think this would accelerate our business itself and contribute to sales.
Oikawa：I think data analysts want to work comfortably, and an automation system provides a big advantage. Simple and boring tasks should be done automatically by machines, so that humans can focus on tasks that require human knowledge. I think automation will help improve business performance.
Sakai：That is exactly what “RakuDA” aims at. I believe, therefore, that it is important to research each elemental technology necessary for automation one by one. We are addressing this steadily, step by step.