
Meeting Summary: CS 486-686 Lecture/Lab (Spring 2025)

Date: February 27, 2025
Time: 08:08 AM Pacific Time (US and Canada)
Meeting ID: 893 0161 6954


Quick Recap

In the session, Greg addressed several key topics:

  • Logistics & Schedule:
    • Discussed upcoming tasks and the class schedule that includes paper discussions, extended content on ForPaPy, and a transition to agent-related topics.
  • Project and Coding Concerns:
    • Addressed student decisions on working individually or in groups.
    • Tackled coding issues and shared troubleshooting steps for embedding models.
  • Performance and Experimentation:
    • Compared embedding models, specifically Ollama versus OpenAI.
    • Explored the potential of local embedding models and alternatives (e.g., Jina Embeddings, Transformers, and Voyage AI).
  • Future Directions:
    • Discussed plans for using AI embeddings and integrating the Ollama model.
    • Raised the importance of metadata (e.g., source file, function name, lines of code) for improving retrieval quality.

Next Steps

Both students and Greg have specific action items to complete:

For Students

  • Install Ollama:
    Install Ollama on their laptops.

  • Finalize Repository Setup:
    Pull the in-class repository and run the example using the Ollama embedding model locally.

  • Project Work Mode:
    Inform Greg by the end of the day whether they will work individually or as a team.

  • Prepare for RAG Discussion:
    Read and prepare to discuss the RAG (Retrieval-Augmented Generation) paper for Tuesday’s class.

  • Paper Preparation:
    Create a paper02.md file based on the provided template for the second paper discussion.

  • Investigate the Code Splitter:
    Understand how the code splitter functions and verify its context-providing capabilities.

  • Review RAG Example:
    Examine a linked example of a similar RAG system for code search.

  • Enhance RAG System:
    Consider adding metadata (e.g., source file, function name, lines of code) to improve the base RAG system.

  • Tune RAG Parameters:
    Think about effective parameter settings for the RAG system.
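
The splitter- and metadata-related action items above can be prototyped in a few lines. Below is a minimal sketch, not the course's actual splitter, of a function-level code splitter that attaches the metadata discussed in class (source file, function name, line range) to each chunk; the `split_code` name and metadata keys are illustrative:

```python
import ast

def split_code(source: str, file_path: str) -> list[dict]:
    """Split Python source into per-function chunks, attaching the
    metadata discussed in class: source file, function name, line range."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            chunks.append({
                # The exact source text of the function definition
                "text": ast.get_source_segment(source, node),
                "metadata": {
                    "source_file": file_path,
                    "function_name": node.name,
                    "lines": (node.lineno, node.end_lineno),
                },
            })
    return chunks

example = '''def add(a, b):
    return a + b

def sub(a, b):
    return a - b
'''
chunks = split_code(example, "math_utils.py")
```

Each chunk's text would then be embedded, with the metadata stored alongside it so retrieved results can point back to the exact file and lines.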

For Greg

  • Website Update:
    Update the website with past content and include materials for week 4.

Below is a Mermaid diagram that outlines the workflow for these next steps:

```mermaid
flowchart TD
    A[Install Ollama on Laptop] --> B[Pull In-Class Repository]
    B --> C[Run Example with Ollama Embedding Model]
    C --> D[Decide: Team or Individual Project]
    D --> E[Notify Greg by End-of-Day]
    E --> F[Read and Prepare for RAG Paper Discussion]
    F --> G[Create paper02.md File Using Template]
    G --> H[Investigate Code Splitter Functionality]
    H --> I[Review RAG System Example for Code Search]
    I --> J[Consider Metadata Enhancements in RAG System]
    J --> K[Tune Parameters for RAG System]

    subgraph Website Update
      L[Greg Updates Website with Past Content & Week 4 Materials]
    end
```

Detailed Summary

Upcoming Tasks and Class Schedule

  • Logistics:
    Greg reviewed logistical matters and provided an overview of the upcoming tasks.

  • Project Decisions:
    Students must decide by the end of the day whether they will work individually or in groups.

  • Learning Materials:
    The website will be updated with new and past materials, and a second paper has been assigned for Tuesday’s discussion on RAG.

  • Upcoming Topics:
    Future sessions will cover agent discussions, fine-tuning, and preparation for the final group project.

  • Coding Troubleshooting:
    Resolution of previous coding issues was discussed.

Embedding Model Performance and Ollama

  • Model Focus:
    The discussion centered on embedding model performance, focusing on Ollama versus OpenAI.

  • Local Execution:
    Emphasis was placed on the benefits of running models locally rather than relying on external APIs.

  • Performance Concerns:
    Although Ollama was installed on a local machine, concerns were raised about its performance impact.

  • Exploration of Alternatives:
    Alternatives such as Jina Embeddings and Transformers were considered, along with the potential use of Voyage AI.

The following Mermaid diagram illustrates the embedding models assessment:

```mermaid
flowchart TD
    A[Ollama] --> B[Local Execution]
    A --> C[Performance Concerns]
    D[OpenAI] --> E[External API Reliance]
    F[Jina Embeddings & Transformers] --> G[Alternative Options]
    H[Voyage AI] --> I[Embedding Experimentation]
```

Python Program Troubleshooting and Setup

  • Development Process:
    Greg developed a Python program to interact with an embedding API.

  • Encountered Issues:
    • Incorrect URL endpoints.
    • Improper payload formats.
  • Resolution Steps:
    Adjustments were made to the code by fixing the model name and correcting the embedding endpoint. Once modified, the API provided the expected response.

  • Next Steps:
    All students are expected to set up and run Ollama locally following the troubleshooting process.
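
As a reference for that setup, here is a minimal sketch of calling Ollama's local embedding endpoint using only the standard library. The endpoint path and payload keys follow Ollama's REST API (`/api/embeddings` with `model` and `prompt`); the model name in the usage note below is an assumption:

```python
import json
import urllib.request

# Ollama's default local embedding endpoint
OLLAMA_URL = "http://localhost:11434/api/embeddings"

def build_request(model: str, text: str) -> dict:
    # The payload format the endpoint expects: "model" and "prompt" keys.
    # (A wrong key here produces the kind of error discussed in class.)
    return {"model": model, "prompt": text}

def get_embedding(model: str, text: str) -> list[float]:
    """POST to a locally running Ollama server and return the
    embedding vector from the JSON response."""
    body = json.dumps(build_request(model, text)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```

With the Ollama server running and a model pulled (e.g., `ollama pull nomic-embed-text`), `get_embedding("nomic-embed-text", "hello world")` returns the embedding vector.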

Ollama Integration and Performance Discussion

  • System Integration:
    Greg detailed the process of connecting LlamaIndex with the Ollama embedding API.

  • Performance Analysis:
    Performance differences were noted between computer types, including Apple Silicon and Intel systems.

  • Local Setup Advancements:
    Initiatives to use LlamaIndex to run the Ollama embedding model locally were discussed.

  • Graphics Note:
    Earlier systems used integrated Intel graphics rather than the GPU, impacting performance.

Exploring Local Embedding Models and Alternatives

  • Experimentation Goals:
    Utilizing local embedding models aims to reduce API calls while facilitating experimentation.

  • Benchmarking:
    Different models were compared for speed and efficiency against OpenAI’s solution.

  • Voyage AI Integration:
    Greg set up the necessary API key and packages to test Voyage AI as an alternative.

  • Collaborative Efforts:
    The group shared insights and troubleshooting tips for implementing various embedding models.
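
A benchmarking harness for these speed comparisons can be sketched as follows. This is an illustrative outline, not the comparison actually run in class: `fake_embed` is a stand-in, and in practice `embed_fn` would wrap a call to Ollama, OpenAI, or Voyage AI.

```python
import time
import random

def benchmark(embed_fn, texts):
    """Time an embedding function over a batch and report throughput."""
    start = time.perf_counter()
    for t in texts:
        embed_fn(t)
    elapsed = time.perf_counter() - start
    return {"seconds": elapsed, "texts_per_sec": len(texts) / elapsed}

def fake_embed(text: str) -> list[float]:
    # Stand-in embedder returning a random 384-dimensional vector
    return [random.random() for _ in range(384)]

stats = benchmark(fake_embed, ["sample text"] * 100)
```

Running the same batch through each candidate model gives a rough texts-per-second figure for comparing local execution against the external APIs.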

Embedding Models and Sequence Limits

  • Challenges:
    The session highlighted challenges such as rate limits and sequence limits impacting model performance.

  • Proposals for Improvement:
    • Use AI embeddings.
    • Evaluate the Ollama model.
    • Improve retrieval quality by incorporating metadata (file path, function name, code lines).
  • Investigation:
    Students were tasked to explore the functionality of the code splitter and assess its contextual accuracy.
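
One common way to work around sequence limits, shown here as a sketch rather than the approach settled on in class, is to split text into overlapping windows so each piece fits under the model's limit. The version below counts characters for simplicity; real splitters count tokens.

```python
def chunk_text(text: str, max_len: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping windows of at most max_len characters,
    so each window fits under an embedding model's sequence limit."""
    if max_len <= overlap:
        raise ValueError("max_len must exceed overlap")
    chunks = []
    step = max_len - overlap  # each window starts this far past the last
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_len])
        if start + max_len >= len(text):
            break  # the final window already reaches the end of the text
    return chunks
```

The `overlap` keeps context that straddles a boundary from being lost to both neighboring chunks, at the cost of embedding some text twice.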

This flowchart outlines the process of addressing embedding model challenges:

```mermaid
flowchart TD
    A[Rate Limits & Sequence Issues] --> B[Performance Challenges]
    B --> C[Proposal: Use AI Embeddings]
    C --> D[Evaluate Ollama Model]
    D --> E[Enhance with Metadata]
    E --> F[Investigate Code Splitter Effectiveness]
```

Conclusion

The meeting concentrated on practical logistics, upcoming tasks, and technical challenges related to embedding models and system integrations. Greg provided thorough instructions and troubleshooting strategies, establishing a clear roadmap for both individual tasks and collaborative efforts. This approach aims to enhance the efficiency and performance of the course’s embedding systems while fostering proactive experimentation.
