Would you like to join a cutting-edge tech project with a positive impact?
Trueflag is a spin-off of Newtral with the mission of using artificial intelligence (AI) to combat disinformation.
We are looking for a Principal NLP Engineer who combines expertise in handling large datasets with a strong business and customer-oriented mindset.
What do we offer?
* Be part of an innovative project where your work will have a direct positive impact.
* Join a leading engineering team that is diverse and fosters a great working environment.
* 100% remote work from anywhere in Spain, with the option to work from a co-working space in Madrid.
* Off-sites and team-building events.
* Flexible working hours.
* Competitive salary aligned with the market.
As part of our AI team, you will join a group of passionate professionals exploring the potential of natural language generation and understanding. You will work with large language models, text embedding models, custom classifiers, quantization techniques, inference engine deployments, fine-tuning and multimodality, with the aim of detecting fake news and fighting disinformation.
You will collaborate with data scientists, data engineers and MLOps specialists to design the next generation of intelligent systems capable of automated fact-checking.
Your mission will be to develop AI/NLP models and libraries to support the intelligent services and agents we are building.
Requirements:
* Proficient in Python programming with over 7 years of experience.
* In-depth knowledge of machine learning techniques, algorithms, and natural language processing (NLP).
* Skilled in data science libraries and frameworks such as NumPy, pandas, scikit-learn, Matplotlib and spaCy (5+ years of experience).
* Expertise in NLP and deep learning frameworks and platforms, including PyTorch, PyTorch Lightning, Hugging Face Transformers and derived libraries (Datasets, PEFT, TRL, bitsandbytes...).
* 3+ years of experience using NLP models such as BERT-like and GPT-like architectures, as well as Sentence Transformers for semantic textual similarity and clustering.
* Proven ability to fine-tune ML models to optimize accuracy and F1 scores.
* Backend knowledge (FastAPI) and experience with inference engines for LLM deployment (vLLM, Ollama).
* Experience in constructing and curating large-scale machine learning datasets.
* Collaborative mindset, working closely with product architecture teams to integrate research and development into products.
Bonus points:
* Experience building Retrieval-Augmented Generation (RAG) models.
* Expertise in designing and implementing vector databases (Pinecone, Elasticsearch, FAISS, etc.).
* Familiarity with multi-agent conversation frameworks and/or multi-agent LLM applications (e.g., AutoGen, CrewAI, AIbitat).
* A PhD in Computer Science, Artificial Intelligence or a similar field.
* Knowledge of CI/CD practices.
* Knowledge of bash scripting and Makefiles.
* Experience with Docker Engine for building images and running containers, and with prototype deployments through Docker Compose.
Are you ready to take on this challenge?