Local Installation of Multiple LLMs

    By: Husam Yaghi

    Here is a step-by-step, tested installation of multiple LLMs on your local machine, with access to a local dataset you define, all through a single chat interface with a dropdown menu to select the desired LLM. My PC has only one GPU. I installed and ran: deepseek-r1:32B, Mistral, Qwen, Gemma, and llama3.3.

    Step 1: Ensure System Requirements

    Before proceeding, confirm the following:

    • Operating System: Windows 10/11 or macOS.
    • Python: Install Python 3.9 or higher from python.org.
    • NVIDIA GPU (optional):
      • If you want to run LLMs on your GPU, ensure you have an NVIDIA GPU with CUDA support; a quick check is shown below this list.
      • Install the CUDA Toolkit (version 11.8) and cuDNN if using PyTorch with GPU acceleration.
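
    If an NVIDIA GPU and its driver are installed, nvidia-smi (which ships with the NVIDIA driver) lists the GPU, the driver version, and the CUDA version it supports:

                  nvidia-smi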

    Step 2: Install Python

    1. Download Python 3.9 or higher from python.org.
    2. During installation:
      • Check the box “Add Python to PATH”.
      • Choose the option to install pip (Python’s package manager).
    3. After installation, verify Python is installed by running:

                  python --version

    If python doesn’t work, try:

                  python3 --version

    Step 3: Install Required Libraries

    The script requires multiple Python libraries. Run the following commands in a terminal (Command Prompt, PowerShell, or macOS Terminal):

    Command to Install All Required Libraries:

     pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

    pip install transformers datasets PyPDF2 python-docx openpyxl python-pptx SpeechRecognition pyttsx3 psutil pandas

    Optional Debugging Commands (If Installation Fails):

    If the above command fails, install individual libraries:

               pip install PyPDF2

               pip install python-docx

               pip install openpyxl

               pip install python-pptx

               pip install SpeechRecognition

               pip install pyttsx3

               pip install psutil

               pip install pandas

               pip install transformers datasets

    # (For GPU users only)

               pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
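
    After the installs finish, a quick import check confirms everything loads (the module names below are the import names of the packages above):

               python -c "import torch, transformers, datasets, PyPDF2, docx, openpyxl, pptx, speech_recognition, pyttsx3, psutil, pandas; print('All libraries imported')"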

    Step 4: Install Ollama

    1. Download and install Ollama from https://ollama.ai.
    2. After installation, verify that Ollama works by running:

                  ollama list
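
    For reference, the GUI script in Step 7 talks to Ollama by piping your question into "ollama run <model>" as a subprocess. Here is a minimal sketch of that pattern (the model tag is just an example; use any model you have pulled):

    import subprocess

    # Send one prompt to a local Ollama model and print the reply.
    process = subprocess.Popen(
        ["ollama", "run", "qwen"],  # example tag; any pulled model works
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        encoding="utf-8",
    )
    stdout, stderr = process.communicate(input="Say hello in one sentence.")
    print(stdout.strip() if process.returncode == 0 else stderr.strip())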

    Step 5: Download Relevant LLM Models

    Run the following commands in your terminal to download the required models locally:

                  ollama pull qwen

                  ollama pull deepseek-r1:32B

                  ollama pull mistral

                  ollama pull gemma

                  ollama pull llama3.3
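
    To confirm a model responds before wiring up the GUI, you can run it once directly from the terminal:

                  ollama run qwen "Reply with OK if you can read this."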

    Step 6: Organize the Directory Structure

    Create the following directory structure on your PC:

    C:\
    ├── deepseek
    │   ├── main_gui.py        # Place the script here
    │   ├── cleanup.py         # Optional system cleanup script
    │   ├── prepare.py         # Optional script for auto-installing dependencies
    │   ├── …
    └── testDataSet            # Place datasets here
        ├── file1.pdf
        ├── file2.docx
        ├── file3.xlsx
        ├── file4.pptx
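
    You can create these folders in PowerShell first (adjust the paths if you keep your files elsewhere):

                  mkdir C:\deepseek

                  mkdir C:\testDataSet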

    Step 7: Run the Script

    1. Open a terminal (Command Prompt, PowerShell, or macOS Terminal).
    2. Navigate to the directory containing main_gui.py:

                       cd C:\deepseek

    3. Run the script:

                      python main_gui.py

    Step 8: Interact with the GUI

    Once the GUI launches:

    1. Ask Questions: Type a question in the input field and choose a model from the dropdown.
    2. Voice Input: Click the “Voice Input” button to use your microphone.
    3. Upload Files: Add context by uploading PDFs, Word documents, Excel files, or PowerPoint slides.
    4. Save History: Save your question-response history using the “Save History” button.
    5. Clear Context: Clear all previously uploaded context with the “Clear Context” button.

    Step 9: Optional Steps

    System Cleanup (Optional)

    Run cleanup.py to free up GPU memory, clear caches, and kill unnecessary processes:

                    python cleanup.py
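
    The cleanup script itself is not reproduced in this article; as a rough idea, here is a minimal sketch of what such a script can do (your cleanup.py may differ):

    # cleanup.py - minimal sketch; the actual script may do more (e.g., kill processes via psutil)
    import gc

    gc.collect()  # Release unreferenced Python objects

    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # Return cached, unused GPU memory to the driver
            print("GPU cache cleared.")
    except ImportError:
        print("PyTorch not installed; skipping GPU cleanup.")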

    Verify GPU Support

    If using GPU, verify PyTorch can access your GPU:

    python -c "import torch; print(torch.cuda.is_available())"

    • If True, PyTorch is configured to use your GPU.
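
    To also see which GPU PyTorch detected:

    python -c "import torch; print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'No GPU detected')"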

    Troubleshooting

    1. No GPU Detected:
      • Ensure the NVIDIA GPU drivers, CUDA Toolkit, and cuDNN are installed.
      • Reinstall PyTorch with GPU support:

                  pip uninstall torch torchvision torchaudio

                  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

    2. Missing Libraries:
      • Install missing libraries using:

                 pip install <library_name>

    3. Performance Issues:
      • Use a smaller model (e.g., deepseek-r1:1.5B instead of deepseek-r1:32B).
      • Free system resources by running cleanup.py.
    4. Ollama Model Not Found:
      • Ensure the model is downloaded:

                  ollama pull <model_name>

    Here is the complete Python script:

    main_gui.py

    Make sure you have downloaded locally all the relevant LLMs which it uses.
    In PowerShell, execute:
    ollama pull qwen
    ollama pull deepseek-r1:32B
    ollama pull mistral
    ollama pull gemma
    ollama pull llama3.3
    ollama pull phi4
    ollama pull qwen2.5
    ollama pull codellama


import tkinter as tk
from tkinter import scrolledtext, filedialog
import subprocess
import threading
import speech_recognition as sr
import pyttsx3
import os
import glob
import PyPDF2
import docx
import pandas as pd

history = []          # Saved question/response pairs for "Save History"
context_history = ""  # Accumulated context fed to the model with each question
engine = pyttsx3.init()
LOCAL_DATASET_DIR = r"D:\yaghiDataSet"  # Update to your dataset folder (e.g., C:\testDataSet from Step 6)
def clear_context(result_text):
    global context_history
    context_history = ""  # Discard all accumulated context so a new topic starts fresh
    result_text.insert(tk.END, "Context cleared. You can now start a new topic.\n\n", "info")
def save_history():
    file_path = filedialog.asksaveasfilename(defaultextension=".txt", filetypes=[("Text Files", "*.txt")])
    if file_path:
        with open(file_path, "w") as file:
            for entry in history:
                file.write(f"Question: {entry['question']}\n")
                file.write(f"Response: {entry['response']}\n\n")
def upload_files(result_text):
    global context_history
    try:
        file_paths = filedialog.askopenfilenames(
            title="Select Files",
            filetypes=[("PDF Files", "*.pdf"), ("Word Files", "*.docx"), ("Excel Files", "*.xlsx"), ("All Files", "*.*")]
        )
        if not file_paths:
            result_text.insert(tk.END, "No files selected.\n\n", "info")
            return
        for file_path in file_paths:
            file_name = os.path.basename(file_path)
            extracted_content = ""
            if file_path.endswith(".pdf"):
                with open(file_path, "rb") as pdf_file:
                    reader = PyPDF2.PdfReader(pdf_file)
                    for page in reader.pages:
                        extracted_content += page.extract_text() + "\n"
            elif file_path.endswith(".docx"):
                doc = docx.Document(file_path)
                for para in doc.paragraphs:
                    extracted_content += para.text + "\n"
            elif file_path.endswith(".xlsx"):
                df = pd.read_excel(file_path)
                extracted_content = df.to_string()
            else:
                result_text.insert(tk.END, f"Unsupported file format: {file_name}\n\n", "error")
                continue
            context_history += extracted_content + "\n"
            result_text.insert(tk.END, f"Successfully added content from: {file_name}\n\n", "info")
    except Exception as e:
        result_text.insert(tk.END, f"An error occurred during file upload: {str(e)}\n\n", "error")

def load_local_dataset():
    global context_history
    try:
        # Check if the directory exists
        if not os.path.exists(LOCAL_DATASET_DIR):
            return "Error: Local dataset directory does not exist."
        # Read all text files, PDFs, Word documents, or other supported formats
        dataset_content = ""
        for file_path in glob.glob(f"{LOCAL_DATASET_DIR}/*"):
            file_name = os.path.basename(file_path)
            print(f"Processing file: {file_name}")  # Debug: Log file name
            if file_path.endswith(".txt"):
                try:
                    with open(file_path, "r", encoding="utf-8") as file:
                        dataset_content += file.read() + "\n"
                except UnicodeDecodeError:
                    # Fall back to another encoding if UTF-8 fails
                    with open(file_path, "r", encoding="latin1") as file:
                        dataset_content += file.read() + "\n"
            elif file_path.endswith(".pdf"):
                with open(file_path, "rb") as pdf_file:
                    reader = PyPDF2.PdfReader(pdf_file)
                    for page in reader.pages:
                        dataset_content += page.extract_text() + "\n"
            elif file_path.endswith(".docx"):
                doc = docx.Document(file_path)
                for para in doc.paragraphs:
                    dataset_content += para.text + "\n"
            elif file_path.endswith(".xlsx"):
                df = pd.read_excel(file_path)
                dataset_content += df.to_string() + "\n"
            else:
                print(f"Unsupported file format: {file_name}")  # Debug: Log unsupported files
                dataset_content += f"Unsupported file format: {file_name}\n"
        # Add the dataset content to the global context_history
        context_history += dataset_content
        print(f"Dataset content loaded (size: {len(dataset_content)} characters)")  # Debug: Log content size
        return "Successfully loaded local dataset."
    except Exception as e:
        return f"Error loading local dataset: {str(e)}"

def voice_input(question_entry, result_text):
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        result_text.insert(tk.END, "Listening…\n", "info")
        try:
            audio = recognizer.listen(source)
            question = recognizer.recognize_google(audio)
            result_text.insert(tk.END, f"Recognized Question: {question}\n", "info")
            question_entry.insert(0, question)
        except sr.UnknownValueError:
            result_text.insert(tk.END, "Could not understand the audio.\n", "error")
        except sr.RequestError as e:
            result_text.insert(tk.END, f"Speech Recognition Error: {str(e)}\n", "error")

def speak_response(response):
    def tts():
        global engine
        try:
            if engine._inLoop:  # Check if the engine is already running
                engine.endLoop()  # End the current loop if necessary
            engine.say(response)
            engine.runAndWait()
        except RuntimeError:
            pass  # Handle any runtime errors gracefully
    tts_thread = threading.Thread(target=tts)
    tts_thread.daemon = True  # Ensure the thread terminates with the main program
    tts_thread.start()

def on_llm_submit(question_entry, result_text, use_local_llm, selected_llm):
    question = question_entry.get().strip()
    if not question:
        result_text.insert(tk.END, "Error: Question cannot be empty.\n\n", "error")
        return
    result_text.insert(tk.END, f"Question: {question}\n", "question")
    llm_choice = selected_llm.get()
    # Set the command based on the selected LLM
    if llm_choice == "DeepSeek-r1:32B":
        command = ["ollama", "run", "deepseek-r1:32b"]
    elif llm_choice == "DeepSeek-r1:1.5B":
        command = ["ollama", "run", "deepseek-r1:1.5b"]
    elif llm_choice == "Mistral (Latest)":  # Add this block
        command = ["ollama", "run", "mistral:latest"]
    elif llm_choice == "Mistral:7B":
        command = ["ollama", "run", "mistral:7b"]
    elif llm_choice == "Llama3:70B":
        command = ["ollama", "run", "llama3:70b", "--cpu"]
    elif llm_choice == "Llama3.3":
        command = ["ollama", "run", "llama3.3:latest"]
    elif llm_choice == "Phi4":
        command = ["ollama", "run", "phi4:latest"]
    elif llm_choice == "Qwen2.5":  # Add this block
        command = ["ollama", "run", "qwen2.5"]
    elif llm_choice == "Gemma":
        command = ["ollama", "run", "gemma:latest"]
    elif llm_choice == "Qwen":
        command = ["ollama", "run", "qwen"]
    elif llm_choice == "Codellama":
        command = ["ollama", "run", "codellama"]
    elif llm_choice == "Mistral":
        command = ["ollama", "run", "mistral"]
    else:
        result_text.insert(tk.END, f"Error: Unknown LLM selected: {llm_choice}\n", "error")
        return
    # Debug: Show the selected LLM in the GUI
    result_text.insert(tk.END, f"Using model: {llm_choice}\n", "info")
    # Load the local dataset if "Use Local Dataset" is selected
    if use_local_llm.get():
        result_text.insert(tk.END, "Using local dataset for the query.\n", "info")
        dataset_status = load_local_dataset()  # Load and update context_history
        result_text.insert(tk.END, f"{dataset_status}\n", "info")
    # Run the LLM in a separate thread
    thread = threading.Thread(
        target=run_llm, 
        args=(command, question, result_text, llm_choice)  # Pass only 4 arguments
    )
    thread.start()

def run_llm(command, question, result_text, llm_name):
    global context_history
    try:
        # Ensure context history size is manageable
        context_history = context_history[-5000:]  # Keep only the last 5,000 characters
        full_input = f"{context_history}\nQuestion: {question}"
        print(f"Context size: {len(context_history)} characters")  # Debug: Log context size
        print(f"Input to {llm_name}:\n{full_input}")  # Debug: Log model input
        # Run the LLM command
        process = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
            encoding="utf-8",
        )
        stdout, stderr = process.communicate(input=full_input)
        # Process the model's output
        if process.returncode == 0:
            response = stdout.strip()
            if response:
                result_text.insert(tk.END, f"Response from {llm_name}:\n{response}\n\n", "response")
                speak_response(response)
                context_history += f"Question: {question}\nResponse from {llm_name}: {response}\n\n"
                history.append({"question": question, "response": response})
            else:
                response = "I'm sorry, the model returned an empty response."
                result_text.insert(tk.END, f"Fallback Response: {response}\n\n", "response")
        else:
            result_text.insert(tk.END, f"Error from {llm_name}:\n{stderr.strip()}\n\n", "error")
    except FileNotFoundError:
        result_text.insert(tk.END, f"Error: The command '{command[0]}' was not found. Is 'ollama' installed?\n\n", "error")
    except Exception as e:
        result_text.insert(tk.END, f"An unexpected error occurred while using {llm_name}: {str(e)}\n\n", "error")

def cleanup_and_exit(root):
    # Stop the pyttsx3 engine
    global engine
    engine.stop()
    # Wait for all non-main threads to finish
    for thread in threading.enumerate():
        if thread is not threading.main_thread():
            thread.join()
    # Destroy the GUI
    root.destroy()

def create_gui():
    root = tk.Tk()
    root.title("Yaghi's LLM Interface")
    root.configure(bg="#2b2b2b")

    # Bind the close event to cleanup_and_exit
    root.protocol("WM_DELETE_WINDOW", lambda: cleanup_and_exit(root))

    # Title Label
    question_label = tk.Label(root, text="Yaghi's Local LLM Chat", bg="#2b2b2b", fg="white", font=("Arial", 16))
    question_label.pack(pady=5)

    # LLM Dropdown Frame
    llm_frame = tk.Frame(root, bg="#2b2b2b")
    llm_frame.pack(pady=5)
    llm_label = tk.Label(llm_frame, text="Choose LLM:", bg="#2b2b2b", fg="white", font=("Arial", 12))
    llm_label.grid(row=0, column=0, padx=5)
    selected_llm = tk.StringVar()
    selected_llm.set("DeepSeek-r1:32B")
    llm_dropdown = tk.OptionMenu(
        llm_frame, selected_llm,
        "Codellama", "DeepSeek-r1:32B", "DeepSeek-r1:1.5B", "Gemma", "Llama3:70B",
        "Llama3.3", "Mistral", "Mistral (Latest)", "Mistral:7B", "Phi4", "Qwen", "Qwen2.5"
    )
    llm_dropdown.grid(row=0, column=1, padx=5)

    # Input field and "Use Local Dataset" checkbox in the same frame
    input_frame = tk.Frame(root, bg="#2b2b2b")
    input_frame.pack(pady=5)
    question_entry = tk.Entry(input_frame, width=50, font=("Arial", 12))
    question_entry.grid(row=0, column=0, padx=5)
    use_local_llm = tk.BooleanVar()
    local_llm_checkbox = tk.Checkbutton(
        input_frame, text="Use Local Dataset", variable=use_local_llm,
        bg="#2b2b2b", fg="white", selectcolor="#3b3b3b", font=("Arial", 10)
    )
    local_llm_checkbox.grid(row=0, column=1, padx=5)

    # Buttons
    button_frame = tk.Frame(root, bg="#2b2b2b")
    button_frame.pack(pady=5)
    tk.Button(
        button_frame, text="Ask Question",
        command=lambda: on_llm_submit(question_entry, result_text, use_local_llm, selected_llm),
        bg="#4caf50", fg="white", font=("Arial", 10), width=12
    ).grid(row=0, column=0, padx=5)
    tk.Button(
        button_frame, text="Voice Input",
        command=lambda: voice_input(question_entry, result_text),
        bg="#2196f3", fg="white", font=("Arial", 10), width=12
    ).grid(row=0, column=1, padx=5)
    tk.Button(
        button_frame, text="Clear Context",
        command=lambda: clear_context(result_text),
        bg="#ffc107", fg="black", font=("Arial", 10), width=12
    ).grid(row=0, column=2, padx=5)
    tk.Button(
        button_frame, text="Upload Files",
        command=lambda: upload_files(result_text),
        bg="#9c27b0", fg="white", font=("Arial", 10), width=12
    ).grid(row=1, column=0, padx=5)
    tk.Button(
        button_frame, text="Save History",
        command=save_history,
        bg="#ff9800", fg="white", font=("Arial", 10), width=12
    ).grid(row=1, column=1, padx=5)
    tk.Button(
        button_frame, text="Exit",
        command=lambda: cleanup_and_exit(root),
        bg="#f44336", fg="white", font=("Arial", 10), width=12
    ).grid(row=1, column=2, padx=5)

    # Result text area
    result_text = scrolledtext.ScrolledText(
        root, wrap=tk.WORD, height=30, width=90, font=("Arial", 10),
        bg="#1e1e1e", fg="white", insertbackground="white"
    )
    result_text.pack(pady=10)
    result_text.tag_config("question", foreground="#ff9800", font=("Arial", 10, "bold"))
    result_text.tag_config("response", foreground="#4caf50", font=("Arial", 10))
    result_text.tag_config("error", foreground="#f44336")

    root.mainloop()

if __name__ == "__main__":
    create_gui()