Skip to main content

Git Submodules Tutorial

Git submodules allow you to keep a Git repository as a subdirectory of another Git repository. This is useful when you want to incorporate external code into your project while keeping it separate and independently versioned.

What You'll Learn

  • What Git submodules are and why they're useful
  • How to add, update, and remove submodules
  • Best practices for working with submodules
  • Common issues and troubleshooting

Prerequisites

  • Basic knowledge of Git
  • Git installed on your system
  • A GitHub account (or access to another Git hosting service)

Step 1: Understanding Git Submodules

Submodules are Git repositories nested inside another repository. They allow you to:

  • Include external code in your project
  • Keep track of specific versions of dependencies
  • Make changes to dependencies while keeping them separate from your main project

When you add a submodule to your repository, Git stores:

  • A reference to a specific commit in the submodule
  • The URL from which the submodule can be cloned
  • The path where the submodule should be placed

Step 2: Adding a Submodule

To add a submodule to your repository, use the git submodule add command:

git submodule add <repository-url> [path]

Example:

# Add a submodule to your project
git submodule add https://github.com/username/library.git external/library

This command:

  1. Clones the repository at https://github.com/username/library.git
  2. Places it in the external/library directory
  3. Creates a .gitmodules file if it doesn't exist
  4. Adds the submodule configuration to .gitmodules

The .gitmodules File

After adding a submodule, Git creates or updates a .gitmodules file that maps the submodule's remote URL to the local path:

[submodule "external/library"]
path = external/library
url = https://github.com/username/library.git

Important: The .gitmodules file should always be committed to your repository and should never be added to .gitignore.

Step 3: Cloning a Repository with Submodules

When you clone a repository that contains submodules, the submodule directories are created but no content is pulled by default.

Method 1: Two-step process

# Clone the main repository
git clone https://github.com/username/main-project.git

# Initialize and update submodules
cd main-project
git submodule init
git submodule update

Method 2: One-step process

# Clone and initialize submodules in one command
git clone --recurse-submodules https://github.com/username/main-project.git

Step 4: Working with Submodules

Updating Submodules

To update a submodule to the latest commit on its remote:

# Navigate to the submodule directory
cd external/library

# Fetch and checkout the latest changes
git fetch
git checkout origin/main # or whatever branch you want

# Go back to the parent repository
cd ../..

# Update the reference in the parent repository
git add external/library
git commit -m "Update library submodule to latest version"

Updating All Submodules at Once

# Update all submodules to their latest remote versions
git submodule update --remote

Making Changes to a Submodule

When you need to modify code in a submodule:

# Navigate to the submodule
cd external/library

# Create a branch for your changes
git checkout -b my-feature

# Make changes, commit them
git add .
git commit -m "Add new feature"

# Push to the submodule's remote
git push origin my-feature

# Go back to the parent repository
cd ../..

# Update the reference in the parent repository
git add external/library
git commit -m "Update library submodule to include my-feature"

Step 5: Removing a Submodule

Removing a submodule is a bit more involved:

# Remove the submodule entry from .git/config
git submodule deinit -f path/to/submodule

# Remove the submodule from the index and working tree
git rm -f path/to/submodule

# Remove the submodule's .git directory
rm -rf .git/modules/path/to/submodule

# Commit the changes
git commit -m "Remove submodule"

Working with Submodules on a Different Machine

When you need to work with a repository containing submodules on a different machine (such as switching between work and home computers), you need to follow specific steps to properly initialize and update the submodules.

Initial Setup on a New Machine

There are two approaches to clone a repository with submodules:

Method 1: Clone and Initialize in Separate Steps

# Step 1: Clone the main repository
git clone https://github.com/username/main-project.git
cd main-project

# Step 2: Initialize the submodule configuration
git submodule init

# Step 3: Fetch the submodule contents
git submodule update

This three-step process:

  1. Clones the main repository without submodule content
  2. Initializes the submodule configuration from .gitmodules into your local .git/config
  3. Fetches the submodule content at the specific commit the main repository is tracking
# Clone and initialize submodules in one step
git clone --recurse-submodules https://github.com/username/main-project.git

This is more efficient and ensures all submodules are properly initialized and updated in a single command.

Working with Submodules After Initial Setup

After cloning, when you want to work on a submodule, follow these steps:

  1. Navigate to the submodule directory:

    cd path/to/submodule
  2. Check the status and branch:

    git status
    git branch

    You might notice you're in a "detached HEAD" state. This happens because the submodule points to a specific commit, not a branch.

  3. Switch to a proper branch:

    git checkout main  # or whatever the primary branch is
  4. Get the latest changes:

    git pull origin main
  5. Make your changes to the submodule

  6. Commit and push from within the submodule directory:

    git add .
    git commit -m "Your changes to the submodule"
    git push origin main
  7. Update the main repository to point to the new commit:

    cd ..  # Go back to main repository
    git add path/to/submodule # This updates the commit reference
    git commit -m "Update submodule reference"
    git push

Keeping Submodules Updated on All Machines

When you pull changes in the main repository that include submodule updates:

# Pull changes in the main repository
git pull

# Update all submodules to match
git submodule update --init --recursive

The --init flag ensures that any new submodules are initialized, and --recursive handles nested submodules.

Common Issues When Working Across Multiple Machines

  1. Empty submodule directories: If you forgot to run git submodule update after cloning or pulling.

  2. Detached HEAD in submodules: This is normal when first updating submodules. If you want to make changes, remember to check out a branch first.

  3. Conflicts in submodule references: If you've modified a submodule on one machine and try to pull changes from another machine that also modified the same submodule.

  4. Forgetting to push submodule changes: Always push changes from within the submodule directory before updating and pushing the main repository.

By following these steps, you can effectively work with Git repositories containing submodules across multiple machines.

Real-World Example: Adding Source Code to a Documentation Project

Let's walk through a practical example of using submodules to manage source code for a documentation project.

Scenario

You're maintaining a documentation site for a Python library, and you want to include the source code as a submodule so that:

  1. Code examples in the documentation always match the actual code
  2. You can update the documentation and code independently
  3. Users can easily find the source code

Step-by-Step Solution

1. Create the Source Code Repository (if it doesn't exist)

# Create a new directory for your source code
mkdir python-library-source
cd python-library-source

# Initialize Git repository
git init
echo "# Python Library Source Code" > README.md
git add README.md
git commit -m "Initial commit with README"

# Add your source files
# ... (add your Python files)
git add .
git commit -m "Add initial source code"

# Create repository on GitHub and push
git branch -M main
git remote add origin https://github.com/username/python-library-source.git
git push -u origin main

# Go back to parent directory
cd ..

2. Add the Source Code as a Submodule to Your Documentation Project

# Navigate to your documentation project
cd documentation-project

# Add the source code repository as a submodule
git submodule add https://github.com/username/python-library-source.git source-code

# Commit the changes
git add .gitmodules source-code
git commit -m "Add source code as submodule"
git push

3. Reference Source Code in Documentation

In your Markdown documentation files, you can now reference specific files from the submodule:

See the implementation in [`source-code/library/core.py`](https://github.com/username/python-library-source/blob/main/library/core.py).

4. Updating the Source Code

When you update the source code:

# Navigate to the source code submodule
cd source-code

# Make changes
# ... (edit files)

# Commit and push changes
git add .
git commit -m "Update implementation"
git push

# Go back to documentation project
cd ..

# Update the submodule reference
git add source-code
git commit -m "Update source code reference"
git push

Common Issues and Troubleshooting

Empty Submodule Directory

Problem: After cloning a repository, submodule directories are empty.

Solution: Initialize and update the submodules:

git submodule init
git submodule update

"You need to resolve your current index first" Error

Problem: You get this error when trying to update a submodule.

Solution: Commit or stash changes in your main repository first:

git stash
git submodule update
git stash pop

Submodule at Wrong Commit

Problem: A submodule is pointing to an old commit.

Solution: Update the submodule:

cd path/to/submodule
git fetch
git checkout origin/main
cd ..
git add path/to/submodule
git commit -m "Update submodule to latest commit"

"fatal: remote error: upload-pack: not our ref" Error

Problem: This can happen when trying to push a submodule.

Solution: Make sure you're on a branch in the submodule:

cd path/to/submodule
git checkout -b my-branch
# or
git checkout main

Best Practices

  1. Always commit .gitmodules: This file should be tracked in version control.

  2. Use specific commits: Point submodules to specific commits rather than branches for stability.

  3. Document submodule usage: Include instructions for submodule initialization in your README.

  4. Consider alternatives: For simple dependencies, consider package managers instead of submodules.

  5. Use shallow clones for large repositories:

    git submodule add --depth 1 <repository-url> [path]
  6. Keep submodules updated: Regularly update submodules to get security fixes and new features.

Conclusion

Git submodules provide a powerful way to include external code in your projects while maintaining separation and version control. While they can be complex to manage, following the best practices in this tutorial will help you use them effectively.

Further Reading