We are seeking a skilled AI Engineer with strong expertise in Large Language Models (LLMs), fine-tuning workflows for both text and vision tasks, and Retrieval-Augmented Generation (RAG) techniques. In this role, you will design and implement our AI pipelines, from data preprocessing and model training to inference and deployment. You will also be a key contributor to integrating these systems with our hardware infrastructure to ensure optimal performance and scalability. The ideal candidate has experience delivering large-scale AI projects, a deep understanding of advanced model architectures (e.g., Transformers, multi-modal models), and practical knowledge of optimizing models for both cloud and on-premise environments. You will collaborate closely with cross-functional teams to build robust solutions that leverage the latest breakthroughs in AI research to meet business needs at our fast-growing startup.

Requirements:

• Proven experience fine-tuning LLMs (GPT, T5, BERT-like models, etc.) for real-world applications, including data collection, data cleaning, and hyperparameter tuning.
• Hands-on experience with vision tasks (e.g., image classification, object detection) and familiarity with popular frameworks (e.g., PyTorch, TensorFlow).
• Strong understanding of Retrieval-Augmented Generation (RAG) workflows, including building or integrating vector databases (e.g., Milvus, Pinecone, or FAISS).
• Proficiency in Python and relevant AI/ML libraries (PyTorch, TensorFlow, Hugging Face Transformers, etc.).
• Experience deploying AI models in production, including inference pipelines, containerization (Docker), and orchestration tools (Kubernetes preferred).
• Familiarity with hardware accelerators (GPUs, TPUs) and the ability to optimize training and inference workloads for the available compute resources.
• Solid knowledge of MLOps best practices, including CI/CD, model versioning, and model serving (e.g., MLflow, Kubeflow).
• Strong grasp of distributed computing and techniques for large-scale data processing (Apache Spark or Dask is a plus).
• Experience with microservices architecture, RESTful APIs, and GraphQL.
• Basic understanding of data engineering pipelines and ETL/ELT processes.
• Effective communication skills and the ability to work in a collaborative team environment.
• Bachelor's, Master's, or PhD in Computer Science, Engineering, or a related field preferred.
• Working proficiency in English, as we have team members from around the globe.