Pithub: How to commit a new file via the GitHub v3 API

19 Jul 2011

In my last article I demonstrated how to create a new download using Pithub. Today I will show you how to commit a new file to a repository. GitHub provides a low-level API that works directly with git objects such as blobs, trees and commits. Committing a file takes several steps:

  1. create a new blob with the content of the file
  2. get the SHA the current master branch points to
  3. fetch the commit behind this SHA to find the tree it points to
  4. create a new tree object with the new blob, based on the old tree
  5. create a new commit object using the new tree and point its parent to the current master
  6. finally update the heads/master reference to point to the new commit

The GitHub API documentation explains the same process in different words. If you want to read more about git internals and how everything fits together, see the Pro Git book, Chapter 9: Git Internals.

Ready? One note before we start: I’m skipping some validation code here to keep the examples easy to read. The full example script includes it. Let’s take a look at step 1, creating a new blob with the content of the file:

#!/usr/bin/env perl
use strict;
use warnings;
use File::Slurp;
use Pithub::GitData;

my $git = Pithub::GitData->new(
    repo  => 'Pithub',
    token => 'your secret token',
    user  => 'plu',
);

my $content = File::Slurp::read_file(__FILE__);

# the encoding can also be 'base64', if necessary
my $blob = $git->blobs->create(
    data => {
        content  => $content,
        encoding => 'utf-8',
    }
);

That was easy. We just used File::Slurp to read the content of the current file and created a new blob with that content. The encoding is set to utf-8 because the whole content can be represented in UTF-8 without losing any information. If you want to create a binary file, you need to encode the content in base64 and set the encoding accordingly.
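For the binary case, here is a minimal sketch of how the data hash for the blob could be built. The helper name binary_blob_data is made up for this illustration and is not part of Pithub. It only relies on MIME::Base64, which ships with Perl.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Hypothetical helper (not part of Pithub): build the "data" hash for a
# blob whose content is raw bytes rather than text.
sub binary_blob_data {
    my ($bytes) = @_;
    return {
        # The empty second argument suppresses the line breaks that
        # encode_base64 would otherwise insert every 76 characters.
        content  => encode_base64( $bytes, '' ),
        encoding => 'base64',
    };
}

my $data = binary_blob_data("\x00\x01\xff");
print "$data->{encoding}: $data->{content}\n";    # prints "base64: AAH/"
```

The resulting hash would then take the place of the data argument in the blobs->create call above.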

The next steps are to get the SHA the current master branch points to and to fetch the commit behind it, which records the tree:

my $master = $git->references->get( ref => 'heads/master' );
my $base_commit = $git->commits->get( sha => $master->content->{object}{sha} );

So $base_commit represents the last commit on the master branch. It also contains the SHA of the tree that this commit points to. This SHA is needed to create a new tree object (see the base_tree key):

my $tree = $git->trees->create(
    data => {
        base_tree => $base_commit->content->{tree}{sha},
        tree      => [
            {
                path => 'examples/gitdata_commit.pl',
                mode => '100755',
                type => 'blob',
                sha  => $blob->content->{sha},
            }
        ],
    }
);
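The mode value deserves a short note. The tree above uses 100755 because the example script is executable; a regular file would use 100644. The sketch below lists the modes git uses for tree entries and shows a small helper for blob entries. The helper name blob_entry and the placeholder SHA are made up for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# File modes git uses for tree entries.
my %TREE_MODES = (
    '100644' => 'regular file (blob)',
    '100755' => 'executable file (blob)',
    '040000' => 'subdirectory (tree)',
    '160000' => 'submodule (commit)',
    '120000' => 'symlink (blob)',
);

# Hypothetical helper (not part of Pithub): build one blob entry for the
# "tree" array, choosing the mode from an "executable" flag.
sub blob_entry {
    my (%args) = @_;
    return {
        path => $args{path},
        mode => $args{executable} ? '100755' : '100644',
        type => 'blob',
        sha  => $args{sha},
    };
}

my $entry = blob_entry(
    path       => 'examples/gitdata_commit.pl',
    sha        => 'placeholder-sha',    # would be $blob->content->{sha}
    executable => 1,
);
print "$entry->{path}: $entry->{mode} ($TREE_MODES{ $entry->{mode} })\n";
```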

Once the new tree has been created, we can finally create a new commit and point the master branch (or better: the heads/master reference) to it:

my $commit = $git->commits->create(
    data => {
        message => 'Add examples/gitdata_commit.pl.',
        parents => [ $master->content->{object}{sha} ],
        tree    => $tree->content->{sha},
    }
);

my $reference = $git->references->update(
    ref  => 'heads/master',
    data => { sha => $commit->content->{sha} }
);
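The validation skipped in the snippets above could look roughly like the sketch below. It is not the code from the full example script. It assumes that every result object offers success and code methods, as Pithub::Result does; the helper name ensure_success and the stand-in class Local::FakeResult are made up so that the sketch runs without network access.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of Pithub): die with a readable message
# unless the API call succeeded, otherwise hand the result back.
# Assumes $result provides success() and code().
sub ensure_success {
    my ( $result, $step ) = @_;
    die sprintf( "%s failed with HTTP status %s\n", $step, $result->code )
        unless $result->success;
    return $result;
}

# Stand-in for a real result object, for demonstration only.
{
    package Local::FakeResult;
    sub new     { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub code    { return $_[0]{code} }
    sub success { return $_[0]{code} >= 200 && $_[0]{code} < 300 }
}

my $ok = ensure_success( Local::FakeResult->new( code => 201 ), 'Creating the blob' );
print 'blob step returned ', $ok->code, "\n";

# A failing step dies; eval catches the message here for display.
my $done = eval {
    ensure_success( Local::FakeResult->new( code => 422 ), 'Creating the tree' );
    1;
};
print "caught: $@" unless $done;
```

Because the helper returns the result it was given, each call in this article could be wrapped in place, for example ensure_success( $git->blobs->create( ... ), 'Creating the blob' ).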

This example, gitdata_commit.pl, really did commit itself. In the current version you’ll notice a few small differences, because I have changed it since then.