A Python script that turns a cloud-based storage system into a Git repository. Since the entire repository is uploaded and/or downloaded for each command, it is recommended only for small repos and for single users.

My motivation for writing this script was to have a simple way to keep my Git repository in sync across my multiple machines, while having everything encrypted when in the cloud and without having to run a separate server.

Links

  1. License
  2. Download
  3. The Script
  4. Caveats
  5. Development Branch on Gitlab

License

The MIT License (MIT)

Copyright (c) 2017 Elliott Karpilovsky

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Download

The remote-storage-git.zip file contains the script and tests. Note that dependencies on either the boto or Google API libraries exist, depending on which storage system you use.

The MD5 sum is provided as a quick check that the file downloaded without corruption. However, it does not authenticate the download; to verify authenticity, check the signature.
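As a rough illustration of the quick check, the MD5 sum of the downloaded file can be computed with Python's standard hashlib module and compared by eye against the published value. The function name md5_of_file is made up for this sketch and is not part of the script.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, md5_of_file("remote-storage-git.zip") should match the published MD5 sum. A match only rules out a corrupted transfer; it says nothing about who produced the file.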

The Script

The script provides four commands: init, destroy, push, and pull. To create a new, empty remote repository, run “init”. To delete it, run “destroy”. To push to it or pull from it, use “push” and “pull” respectively. The remote repository is always stored as a single, encrypted tar file.

Whenever a pull is done, the script downloads the file, decrypts and unpacks it into a temporary repository, runs a local git pull against that temporary repository, and then deletes it. A “push” operates similarly, except that the temporary repository is then tarred, encrypted, and uploaded to the cloud.
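The pack/unpack and encrypt/decrypt steps might look roughly like the sketch below. It is not the script's actual code: the function names are hypothetical, the tar handling uses Python's standard tarfile module, and the gpg command lines are only built (not executed), on the assumption that the key is passed as a passphrase on standard input.

```python
import os
import tarfile


def pack_repository(repo_dir, tar_path):
    """Pack the directory `repo_dir` into an uncompressed tar file."""
    with tarfile.open(tar_path, "w") as archive:
        # Store paths relative to the repository's own directory name.
        name = os.path.basename(os.path.normpath(repo_dir))
        archive.add(repo_dir, arcname=name)


def unpack_repository(tar_path, dest_dir):
    """Unpack a tar file created by pack_repository into `dest_dir`."""
    with tarfile.open(tar_path, "r") as archive:
        # Newer Python versions offer extraction filters; use one if present.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        archive.extractall(dest_dir, **kwargs)


def gpg_encrypt_command(plain_path, encrypted_path):
    """Build a gpg command for symmetric AES256/SHA512/BZIP2 encryption.

    Assumption: the passphrase is written to gpg's stdin (fd 0). Some gpg 2.x
    setups additionally need "--pinentry-mode loopback" for this to work.
    """
    return [
        "gpg", "--batch", "--yes", "--symmetric",
        "--cipher-algo", "AES256",
        "--digest-algo", "SHA512",
        "--compress-algo", "BZIP2",
        "--passphrase-fd", "0",
        "--output", encrypted_path,
        plain_path,
    ]


def gpg_decrypt_command(encrypted_path, plain_path):
    """Build the matching gpg decryption command (passphrase on stdin)."""
    return [
        "gpg", "--batch", "--yes",
        "--passphrase-fd", "0",
        "--output", plain_path,
        "--decrypt", encrypted_path,
    ]
```

Either command list could be handed to subprocess.run with the key supplied through its input argument; a pull would then be download, decrypt, unpack, git pull, and a push would end with pack, encrypt, upload.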

The script is configured via a config file, using the INI format.
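To give a feel for the format, here is a small sketch that parses an INI config with Python's standard configparser module. The field names key, bucket, credentials_filename, and remote_filename come from the setup instructions on this page, but the section names ([general], [s3], [gdrive]) and all values are placeholders assumed for illustration; check the config shipped with the script for the real layout.

```python
import configparser

# Hypothetical layout: section names and values are illustrative only.
SAMPLE_CONFIG = """
[general]
key = REPLACE_WITH_GENERATED_KEY

[s3]
bucket = example-bucket

[gdrive]
credentials_filename = /home/user/.remote_storage_git/credentials.json
remote_filename = repo.tar.gpg
"""


def load_config(text):
    """Parse INI-formatted text and return the ConfigParser object."""
    # interpolation=None keeps any literal '%' in a value from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


config = load_config(SAMPLE_CONFIG)
print(config["s3"]["bucket"])
```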

General Setup

Whether using Amazon S3 or Google Drive, the following steps must be taken:

  1. Configure the key to use for symmetric encryption/decryption. This is stored in the key field in the config. All clients must be configured with the same key. To generate a random key, invoke:
python -c 'import base64; import os; print(base64.standard_b64encode(os.urandom(40)).decode("utf-8")[:-2])'
  2. The gpg binary must be installed and available from the shell. It must support AES256, SHA512, and BZIP2 as the cipher, hash, and compression algorithms, respectively.
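The two steps above can also be written out as a short Python sketch. generate_key mirrors the one-liner (40 random bytes, base64-encoded, padding removed), though it strips the padding with rstrip rather than slicing off the last two characters; gpg_available only checks that a gpg binary is on the PATH, not which algorithms it supports. Both function names are made up for this example.

```python
import base64
import os
import shutil


def generate_key(num_bytes=40):
    """Return a random base64 key with the trailing '=' padding removed."""
    encoded = base64.standard_b64encode(os.urandom(num_bytes)).decode("utf-8")
    # For 40 bytes the encoding ends in "==", so this equals the [:-2] slice.
    return encoded.rstrip("=")


def gpg_available():
    """Return True if a gpg executable can be found on the PATH."""
    return shutil.which("gpg") is not None


print(generate_key())
```

With the default of 40 bytes the key is 54 characters long. The supported algorithms can be inspected by running gpg --version from the shell.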

Amazon S3 Setup

Follow these instructions to configure the script to use Amazon S3 as the storage mechanism. Please note that these instructions were last used on January 16, 2017; since web services tend to change their interfaces regularly, they may no longer be valid.

  1. Install the boto library, e.g., pip install boto.
  2. Sign up for Amazon S3 service, if you have not already. Note that this will probably cost money.
  3. Create a bucket through the Amazon web interface.
  4. Fill in the bucket name in the bucket field in the config file.
  5. On the Amazon S3 console, click on your user name in the upper right and select “My Security Credentials.” Tell it you want to create an “IAM user”.
  6. Add a user and give it programmatic access.
  7. Create a group and only give it permission to read/write to the bucket you created earlier. You may need to create the group separately, and then create a policy to associate with it. The policy I used is for a bucket called example-bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::example-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::example-bucket"
    }
  ]
}
  8. Add the user to the group.
  9. Go to the user’s page, select your newly created user, and select the “security credentials” tab. Delete the existing access key. Create a new one to get both the access key and the secret key. Store them in the config file.
  10. Pick a filename for the file as it will be stored once uploaded.
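If you use a different bucket name, the policy shown in the steps above can be regenerated rather than edited by hand. The sketch below builds the same two-statement policy for an arbitrary bucket using only Python's standard json module; the function name bucket_policy is hypothetical.

```python
import json


def bucket_policy(bucket_name):
    """Return a policy dict granting all S3 actions on one bucket only."""
    bucket_arn = "arn:aws:s3:::" + bucket_name
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Access to the objects stored inside the bucket.
            {"Effect": "Allow", "Action": "s3:*", "Resource": bucket_arn + "/*"},
            # Access to the bucket itself (for example, listing its contents).
            {"Effect": "Allow", "Action": "s3:*", "Resource": bucket_arn},
        ],
    }


print(json.dumps(bucket_policy("example-bucket"), indent=2))
```

The printed JSON can be pasted into the policy editor when creating the group's policy.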

Google Drive Setup

Follow these instructions to configure the script to use the Google Drive API. Please note that these instructions were last used on January 16, 2017; since web services tend to change their interfaces regularly, they may no longer be valid.

  1. Follow the instructions on the Python QuickStart guide to create a new project, generate a client_secret.json file, and install the Google API Python client. As part of the process, you will have to create a client ID with credentials. Edit the config.ini file with this client name and point to the client secret.
  2. Set the credentials_filename to point to a location where authentication credentials will be stored. Note that the directory must already exist.
  3. Set the remote_filename for the name of the file when it is uploaded.
  4. Run python gdrive.py. The first time you run it, you will be asked to authenticate the app via a web page. If you have generated a new JSON secret, you must first delete any old credentials saved to credentials_filename; otherwise authentication will not work. After running, the script generates a file ID for you to put in the config.ini file.

Flags

  • --storage: the storage system to use. Defaults to gdrive, but also accepts s3.
  • --config: location of the configuration file. Defaults to ~/.remote_storage_git/config.ini.
  • --verbose: prints additional information as the script runs.

Examples

  • ./remote_storage_git.py init
  • ./remote_storage_git.py push
  • ./remote_storage_git.py pull
  • ./remote_storage_git.py destroy
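For readers who want to see how the flags and commands fit together, here is a minimal argparse sketch of a comparable command line. It is not the script's actual parser: the function name build_parser is hypothetical, and treating --verbose as a simple on/off switch is an assumption.

```python
import argparse
import os


def build_parser():
    """Build a parser mirroring the documented flags and commands."""
    parser = argparse.ArgumentParser(prog="remote_storage_git.py")
    parser.add_argument("--storage", choices=["gdrive", "s3"], default="gdrive",
                        help="the storage system to use")
    parser.add_argument("--config",
                        default=os.path.expanduser("~/.remote_storage_git/config.ini"),
                        help="location of the configuration file")
    # Assumption: --verbose takes no value and simply switches output on.
    parser.add_argument("--verbose", action="store_true",
                        help="print additional information")
    parser.add_argument("command", choices=["init", "destroy", "push", "pull"])
    return parser


args = build_parser().parse_args(["--storage", "s3", "push"])
print(args.storage, args.command)
```

Under these assumptions, an S3 push with extra output would be invoked as ./remote_storage_git.py --storage s3 --verbose push.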

Caveats

  1. There is no synchronization and no file update check. There are no guarantees about what happens if multiple writes to the same repository occur. Moreover, since some cloud-based systems are eventually consistent, it is up to the user to wait a sufficient amount of time after a write before trying a read.

  2. The entire repository is uploaded / downloaded via this script. As such, it is meant for small repositories only.