Company Overview:
We are a pioneering organization that generates training data for all leading large language models (LLMs) to advance Artificial General Intelligence (AGI), especially in the domains of coding, advanced reasoning, planning, and STEM. Our vision is to design the best systems for combining human knowledge and model capability into training data for the next generation of LLMs designed towards AGI.

Job Overview:
We are seeking a highly skilled Research and Engineering Lead to:
- Collaborate to understand data needs: Work with researchers at leading LLM companies to understand the training data needs of the next generation of LLMs in domains such as coding, advanced maths, or robotics.
- Implement data generation: Work with the internal R&D team and engineers to design the processes and systems that generate the needed training data as effectively as possible.
- Ensure data quality and throughput: Work with internal operational leaders to design a scalable process that leverages the knowledge of hundreds of knowledge workers, assisted by existing LLM capabilities, to build high-quality data efficiently.

Qualifications:

Technical Expertise: You need to have a very strong technical foundation.
- Strong background in coding, software development, or related fields.
- Proficiency across multiple programming languages, with deep expertise in at least one of the following: Python, JavaScript, Java, React.
- Experience with coding languages and environments, including the ability to review, correct, and explain code effectively.
- Understanding of fine-tuning data creation workflows, especially for coding tasks, is a plus.

Fast-iteration and fast-learning attitude: You need to iterate fast with leading LLM researchers in this cutting-edge space.
- Comfortable working in a highly iterative pattern with researchers and engineers; there won't be a quarterly plan because we are exploring the unknown.
- Learn the LLM training domain quickly and in depth: even though you may have taken classes or run machine learning projects, the LLM training domain changes rapidly. Every month, new sub-domains appear and existing ones deepen.
- Learn quickly in a domain that is new to you: it is fascinating how much software engineering knowledge can help advance AGI across coding, reasoning, planning, maths, physics, chemistry, and more. You need to be able to learn a new domain quickly (with the help of ChatGPT)!

Communication and collaboration:
- Brainstorm with researchers on the best dataset for developing a given capability in the next generation of LLMs.
- Translate research ideas into the fastest step-by-step way to iterate.
- Understand and verify the operational plan.
- Follow up with researchers on how the data is used in LLM training and how to measure the effectiveness of the dataset.

Ownership & Urgency:
- A high sense of ownership and responsibility for training data quality, which determines the LLM's performance.
- Strong problem-solving skills, with the ability to think critically and act quickly when needed.

Interpersonal Skills:
- Exceptional interpersonal and communication skills, with the ability to work well with diverse teams and clients.
- Ability to motivate and inspire a team, fostering a positive and productive work environment.

What We Offer:
- Opportunity to play a key role in advancing AGI through high-quality training data.
- A career path that leads to multiple possibilities: applied AI research or engineering, engineering or architecture roles on large-scale GenAI deployments, or senior people management.
- A collaborative and innovative environment.
- A dynamic work culture with opportunities for growth and leadership.
- A competitive compensation package.