Beyond the Algorithm: Cohere 4 AI, LLM Trends, and ML Data Mastery
November 13, 2024

The Apption Data Podcast - Episode 6

Looking beyond the algorithm, what do you see? In this episode, Anthony Susevski, Data Scientist at RBC Capital Markets and Aya Expanse Ambassador with Cohere for AI, expands on exciting AI initiatives, AI in the workplace, LLM industry trends, and data quality for machine learning. Learn why AI won't be taking jobs and how it will instead help professionals reach new heights through efficiency gains in the workplace.

Journey and background

Anthony's journey began at the University of Waterloo, where he studied economics and mathematics. He discovered data science through conversations with friends who highlighted its exciting potential and promising career prospects. He also worked as a research assistant during university and later joined RBC as a co-op student, which led to his full-time role. Anthony's engagement with Cohere for AI started after pursuing opportunities to explore data quality and contribute coding examples, driven by his active involvement in various tech communities.

Cohere for AI, exciting initiatives, and enhancing the AI community

Anthony highlights his involvement with Cohere for AI, where he worked on coding projects and data quality initiatives. He emphasizes Cohere’s commitment to engaging with community-driven research and supporting the community through events and collaborative projects. This involvement showcases Cohere's distinct approach in fostering a supportive ecosystem for AI enthusiasts.

Anthony expands on a few exciting initiatives, such as:

  1. Aya Expanse and the #MysteryBot

  2. Expedition Aya

  3. Research paper publications - MAYA (multi-modal multilingual LLM)

AI in the workplace

AI has been transformative in the workplace and is making waves across many industries. Anthony stressed that current AI technology is not set to replace jobs but rather to enhance efficiency. In the banking industry, for instance, AI helps teams quickly draft content and conduct competitive analyses, providing a productivity boost even if the outputs aren't always perfect.

AI is also changing customer service, where it enables efficient call triaging. Vision language models are enhancing OCR-based tasks, promising improvements in fields like insurance. However, Anthony noted that these models must become more energy-efficient to mitigate their environmental impact. Looking ahead, Anthony explains that while AI's efficiency gains are significant, maintaining human oversight and balancing technological advancements with sustainability are crucial for the future.

LLM industry trends

Large language models (LLMs) and GenAI are all the rage right now, and several trends are emerging. Anthony highlighted the push toward smaller, more efficient models that can handle specific tasks, such as OCR for insurance, to optimize performance and reduce energy use. Instead of constantly training new models, the industry should be shifting to making better use of existing ones through engineering innovations like test-time optimization and strategic sampling techniques, which enhance performance without heavy retraining.
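
As a rough illustration of the sampling idea, one widely used test-time technique is majority voting over several samples from the same model (often called self-consistency). The sketch below is not taken from the episode: `majority_vote` and `make_fake_model` are hypothetical names, and the fake model only replays canned answers in place of a real LLM call.

```python
from collections import Counter
from typing import Callable, Iterator


def majority_vote(sample_answer: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Query the same model n times and return the most frequent answer."""
    if n < 1:
        raise ValueError("n must be at least 1")
    answers = [sample_answer(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]


def make_fake_model(canned: list[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM call: replays canned answers in order."""
    replies: Iterator[str] = iter(canned)
    return lambda prompt: next(replies)


# Five samples for one prompt; the most common answer wins the vote.
fake_model = make_fake_model(["42", "41", "42", "42", "40"])
best = majority_vote(fake_model, "What is 6 * 7?", n=5)
print(best)
```

The trade-off is extra inference cost per question in exchange for a more reliable answer, with no retraining of the underlying model.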

Anthony also noted that training data saturation has pushed developers to explore smarter engineering approaches as most LLMs have already been pre-trained on vast internet content. While competition among major AI players remains fierce, the focus should be moving towards sustainable and cost-effective solutions that leverage existing models more efficiently.

Data quality mastery for machine learning

Anthony underscores the critical role of data quality in building successful ML models, especially in regulated industries like banking. In Anthony's opinion, developers should spend around 90% of their time in this stage building thorough evaluations with their data. This process involves tedious work of reading through data and constructing evaluations to ensure the data is of the highest quality. While this effort extends beyond data quality alone, ensuring high data quality is essential for developing reliable and effective LLM applications.
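
To make the idea of building evaluations with your data concrete, here is a minimal sketch of an evaluation loop. It is an assumption about what such a harness could look like, not Anthony's actual workflow: `Example`, `evaluate`, and `keyword_classifier` are hypothetical names, and the tiny dataset is invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Example:
    text: str       # raw input, ideally read and checked by a person
    expected: str   # hand-verified label


def evaluate(
    predict: Callable[[str], str], examples: list[Example]
) -> tuple[float, list[tuple[Example, str]]]:
    """Return accuracy plus every failing example with the model's output."""
    failures = []
    for ex in examples:
        got = predict(ex.text)
        if got != ex.expected:
            failures.append((ex, got))
    accuracy = 1 - len(failures) / len(examples) if examples else 0.0
    return accuracy, failures


def keyword_classifier(text: str) -> str:
    """Hypothetical stand-in for a real model or LLM call."""
    return "complaint" if "refund" in text.lower() else "inquiry"


# Invented examples for illustration only.
dataset = [
    Example("I want a refund for this charge", "complaint"),
    Example("What are your branch hours?", "inquiry"),
    Example("This fee is unacceptable", "complaint"),
]

accuracy, failures = evaluate(keyword_classifier, dataset)
for ex, got in failures:
    print(f"expected={ex.expected!r} got={got!r} text={ex.text!r}")
```

Returning the failing rows alongside the score keeps the focus on reading the data itself, since each failure may point to either a model weakness or a labeling problem.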

Anthony recounts his pre-ChatGPT experience tackling projects where data inconsistencies posed challenges, and how diligent data examination and manual labeling were essential to overcoming these issues. With the advancements in LLMs such as OpenAI's ChatGPT, Anthony emphasizes that this manual data examination and labeling is now a thing of the past.

The future of AI

Anthony envisions a future where AI serves as an indispensable tool for efficiency without replacing human roles, stressing that the human-in-the-loop approach remains vital for verification and maintaining accuracy. LLMs are bridging communication gaps and democratizing education and access to technology. AI tools will continue to enhance learning and productivity, and now is the time to take the next step in AI adoption.

As LLMs continue to evolve and reshape industries, the focus must remain on strategic innovation, data quality, and sustainable practices. Insights from experts like Anthony Susevski highlight the importance of balancing efficiency, accuracy, and ethical responsibility. The future of AI development will be driven not only by technological breakthroughs but also by thoughtful implementation that considers environmental impact, data integrity, and the enhancement of human expertise. By adopting these principles, businesses and developers can harness the full potential of LLMs to create more effective solutions.

Listen to the full episode here!

Connect with Anthony Susevski on X, LinkedIn, or through Linktree.

Written By: Lauren Farrell