sis-commits mailing list archives

From "Martin Desruisseaux (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SIS-422) Migrate from SVN to Git as the main SIS code repository
Date Fri, 22 Jun 2018 12:01:00 GMT

    [ https://issues.apache.org/jira/browse/SIS-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520284#comment-16520284 ]

Martin Desruisseaux commented on SIS-422:
-----------------------------------------

Cleaning the history by removing the above-cited files reduced the {{.git}} directory size
from 35 MB to 27 MB.
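
The message does not say how the directory size was measured. As a minimal sketch, assuming a reasonably recent git and a POSIX shell, the figures could be checked as below. The helper name {{repo_size}} is hypothetical and is not part of the SIS-422 work.

```shell
#!/bin/sh
# repo_size is a hypothetical helper, not taken from the SIS-422 issue.
# It prints git's own object statistics followed by the size of the git directory.
repo_size() {
    (
        # Run in a subshell so the caller's working directory is left unchanged.
        cd "${1:-.}" || exit 1
        # "size" is the space used by loose objects, "size-pack" by packed objects.
        git count-objects -vH
        # Total on-disk size of the git directory (".git" for a normal clone).
        du -sh "$(git rev-parse --git-dir)"
    )
}
```

Calling {{repo_size /path/to/clone}} before and after the cleanup gives two sets of numbers to compare. The final {{git gc --prune=now}} matters here, since unreferenced objects keep occupying space until they are pruned.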

> Migrate from SVN to Git as the main SIS code repository
> -------------------------------------------------------
>
>                 Key: SIS-422
>                 URL: https://issues.apache.org/jira/browse/SIS-422
>             Project: Spatial Information Systems
>          Issue Type: Improvement
>            Reporter: Martin Desruisseaux
>            Assignee: Martin Desruisseaux
>            Priority: Major
>             Fix For: 1.0
>
>
> Migrate to git as the main source code repository. After this work:
> * The source code repository will become https://gitbox.apache.org/repos/asf/sis.
> * The https://svn.apache.org/repos/asf/sis/trunk/ repository will become read-only.
> We will continue to use Subversion for {{site}}, {{sis-data}} and {{non-free}}. Before making the new git repository ready for use, we will try to clean up its history by removing large files, especially:
> * {{California_Restaurants.csv}} (19 MB)
> * {{DEPARTEMENT.SHP}} (3 MB)
> * {{ANC90Ply_4326.shp}} (0.7 MB)
> Those large files were identified as below (source: [stackoverflow|https://stackoverflow.com/questions/10622179/how-to-find-identify-large-files-commits-in-git-history]):
> {code:Bash}
> git rev-list --objects --all | sort -k 2 > allfileshas.txt
> git gc && git verify-pack -v .git/objects/pack/pack-*.idx | egrep "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt
> for SHA in `cut -f 1 -d\  < bigobjects.txt`; do
> echo $(grep $SHA bigobjects.txt) $(grep $SHA allfileshas.txt) | awk '{print $1,$3,$7}' >> bigtosmall.txt
> done;
> {code}
> Commands executed for removing them:
> {code:Bash}
> git filter-branch --tree-filter 'find . -name "California_Restaurants.csv" -delete' -- --all
> git filter-branch --tree-filter 'find . -name "DEPARTEMENT.*" -delete' -- --all
> git filter-branch --tree-filter 'find . -name "ANC90Ply_4326*" -delete' -- --all
> git filter-branch --tree-filter 'find . -name "*~" -delete' -- --all
> git filter-branch --tree-filter 'rm -rf "sis-data"' -- --all
> git update-ref -d refs/original/refs/heads/master
> git reflog expire --expire=now --all
> git gc --prune=now
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
