Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
1
votes
1
answers
160
views
mirror a directory tree by hard links for file contents and symlinks for directory structure
what is the best way to mirror an entire directory, say `original/`, to a new directory, say `mirror/`, which has the structure `mirror/data/` and `mirror/tree/`, such that
- every file in the directory `original/` or in any of its subdirectories is hardlinked to a *file* in `mirror/data`
- whose filename is a unique identifier of its content, say a hash of its content, and
- which is symlinked to from a point in `mirror/tree` whose relative path corresponds to the relative path of the original file in `original`,
such that it can be easily restored?
is this feature perhaps implemented by some tool in existence? – one that allows to flexibly choose the command for creating a unique identifier for a file by its content.
---
for instance, say there is only one file `original/something`, which is a textfile containing the word “data”. then i want to run a script or command on `original`, such that the result is:
$ tree original mirror
original
└── something
mirror
├── data
│   └── 6667b2d1aab6a00caa5aee5af8…
└── tree
    └── original
        └── something -> ../../data/6667b2d1aab6a00caa5aee5af8…

5 directories, 3 files
here, the file `6667b…` is a hard link to `original/something` and its filename is the sha256sum hash of that file. note that i have abbreviated the filename for legibility.
i want to be able to perfectly restore the original by its mirror.
i know i can write a script to do that, but before i do that and maybe make a mistake and lose some data, i want to know if there is any tool out there that already implements this safely (i didn’t find any so far) or if there are any pitfalls.
*background*: i want to keep an archive of a directory that tracks renames, but i don't need versioning. i know that `git-annex` can do that with a lot of overhead using git repositories, but i only need its way to mirror the contents of a directory using symlinks for the directory structure to files whose file names are hashes of their content. then i could use git-diff to track renames. i don't fully understand what git-annex is doing so i don't want to trust it with archiving my data. so i'm looking for a lighter alternative that is less intrusive.
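The layout described in this question can be sketched as a small POSIX shell function. This is only an illustration, not an existing tool: the name `mirror_tree` is hypothetical, `sha256sum` is assumed as the hashing command, and the source and the mirror are assumed to sit on the same filesystem (hard links cannot cross filesystems). Directory names containing newlines are not handled.

```shell
#!/bin/sh
# mirror_tree SRC DEST  (hypothetical helper, not an existing tool)
#   DEST/data/<sha256>  : hard link to each regular file under SRC
#   DEST/tree/<relpath> : relative symlink pointing at the data entry
# Assumes SRC and DEST share a filesystem and DEST/tree is fresh.
mirror_tree() {
    src_dir=$(cd "$1" && pwd) || return 1
    mkdir -p "$2/data" "$2/tree" || return 1
    dest_dir=$(cd "$2" && pwd) || return 1
    (
        # Work from the parent so that paths start with the name of SRC.
        cd "$(dirname "$src_dir")" || exit 1
        # -type f picks regular files only; symlinks and empty
        # directories in SRC are not mirrored by this sketch.
        find "$(basename "$src_dir")" -type f -exec sh -c '
            dest=$1; shift
            for rel do
                sum=$(sha256sum < "$rel") || exit 1
                hash=${sum%% *}
                # Identical content shares one data entry; only the first
                # file seen with that content is hard linked.
                [ -e "$dest/data/$hash" ] || ln "$rel" "$dest/data/$hash" || exit 1
                dir=$(dirname "$rel")
                mkdir -p "$dest/tree/$dir" || exit 1
                # One .. per directory component, plus one to leave tree/.
                up=$(printf "%s\n" "$dir" | sed "s|[^/][^/]*|..|g")
                ln -s "../$up/data/$hash" "$dest/tree/$rel" || exit 1
            done
        ' sh "$dest_dir" {} +
    )
}
```

Running `mirror_tree original mirror` on the one-file example should give the tree shown above. Since the symlinks are relative, a restore can be as simple as `cp -RL mirror/tree/original restored`, which copies the tree while following the symlinks. Two pitfalls are worth noting: a hard link shares its inode with the original, so editing a file in place later changes the data entry while its name still carries the old hash; and when several files have identical content, only the first one is hard linked, the others are merely represented by a symlink to the same data entry.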
windfish
(113 rep)
Feb 2, 2024, 12:39 PM
• Last activity: Feb 2, 2024, 09:51 PM
1
votes
1
answers
125
views
Git-annex auto merge symbolic links?
Is it possible to merge two identical files automatically?
For example, fileA and fileB are two identical files. However, fileA is on PC and fileB is on Laptop. If I run `git annex import /path/to/fileA` and `git annex import /path/to/fileB` on the respective devices, there will still be two different symbolic links in the git tree after running `git annex sync`.
So, is there something like an auto-merge tool that can remove one of those two symbolic links?
TJM
(574 rep)
Jan 22, 2017, 09:59 AM
• Last activity: Feb 25, 2021, 10:42 PM
1
votes
1
answers
364
views
accessing git-annex special remote from new repository
I'm using [`git-annex`](https://git-annex.branchable.com/) in version `7.20190129` as it is provided on my `Debian Stable (Buster)` machine to keep big files under version control and have them distributed over multiple machines and drives. This works well as long as I have at least one "real" `git-annex` repository (not a [special remote](https://git-annex.branchable.com/special_remotes/)).
What I'd be interested in is using just one `git annex` repository on my local machine and additionally special remotes (e.g. the [`bup`](https://bup.github.io/) [special remote](https://git-annex.branchable.com/special_remotes/bup/) or the [`rsync`](https://rsync.samba.org/) [special remote](https://git-annex.branchable.com/special_remotes/rsync/) or, as soon as it lands on `Debian Stable`, the [`borg`](https://www.borgbackup.org/) [special remote](https://git-annex.branchable.com/special_remotes/borg/)).
My workflow is as follows:
cd /path/to/my/local/folder
git init
git annex init
git annex add myawesomefile
git commit -m 'this works on my local repository'
git annex initremote mybupbackuprepo type=bup encryption=none buprepo=/path/to/my/special/remote/location
git annex sync
git annex copy files --to mybupbackuprepo
Then I'm able to use my `bup` special remote as I would use an additional repository.
But now I'd like to access my `bup` repo without using the first, local repo (e.g. in case my local machine would break down). As far as I understood (from following the [official guide](https://git-annex.branchable.com/walkthrough/#index12h2)), the following should work:
cd /path/to/new/folder/to/extract/the/backup
git init
git annex init
git annex initremote mybupbackuprepo type=bup encryption=none buprepo=/path/to/my/special/remote
git annex enableremote mybupbackuprepo
git annex sync
But I'm still not able to see any files (or even some broken symlinks) and, obviously, also not able to get any of my data when using `git annex sync --content` or `git annex get myawesomefile`.
Any ideas? What am I missing?
n0542344
(416 rep)
Jan 6, 2021, 01:35 AM
• Last activity: Jan 13, 2021, 07:52 AM
1
votes
1
answers
218
views
git-annex created a monster that I cannot un-init
I was going to explore git-annex, but it turns out it doesn't do what I need. That would be fine except that I initialized it on a repository that I'm using (stupid, I see that now) and now all my files have been replaced with symlinks.
I tried running `git-annex uninit` and it seemed like it was doing something:
unannex 2017/mapping/index.html ok
unannex 2017/mapping/slide_deck.md ok
unannex index.html ok
git-annex: Not fully uninitialized
Some annexed data is still left in .git/annex/objects/
This may include deleted files, or old versions of modified files.
But everything is still symlinks. What am I missing here? I even tried deleting the files in `2017/mapping` and copying new files over them but when I try to open them in Atom and edit them, when I save I get a permission error and when I go look I see that they're symlinks again. Even creating new files creates symlinks instead of files, which is confusing and frustrating.
How can I revert to where I was before all these symlinks?
I thought I'd solved the problem when I realized the git-annex daemon was still running, but I just went to commit my work and push and it is back.
Amanda
(1818 rep)
Nov 2, 2017, 11:34 PM
• Last activity: Apr 30, 2019, 10:40 AM
0
votes
2
answers
515
views
git annex - how to verify 2 repositories are exactly identical
How can I ensure that when I clone, sync, and get content from another git annex repository, I have set up an identical mirror?
I have used a tool like unison in the past, which did a file-to-file comparison, but that is time and memory intensive.
Are there any other alternatives so I can perform a sanity check? The main motivation for this is that I just made a clone of an existing repository and it is smaller. I expect it to be smaller because the old repository has unused or unreferenced objects, but the size difference is quite large.
So, I'd like to have some check I can run.
Walter
(1264 rep)
Oct 21, 2016, 08:52 PM
• Last activity: Feb 19, 2018, 01:19 PM
1
votes
0
answers
71
views
Git-annex link to different file names
Maybe this is just a crazy use case that doesn't work, but I was wondering if there's a way to build a file's history from files with different file names. I'm exploring this idea because I'd like to have a git-annex system but I can't force my coworkers to adapt.
Here's what I have in mind :
> Folder 1, managed by coworkers (On a shared disk) :
>
- drawing_shop_12_nov_2015.pdf
- drawing_shop_13_nov_2015.pdf
- drawing_asbuilt_14_nov_2015.pdf
- drawing_asbuilt_rev1_15_nov_2015.pdf
And
> Git-annex, managed by me :
>
- drawing.pdf
>
(with a *shop* branch and a *asbuilt* branch)
The git-annex's `drawing.pdf` would have a history like this:
[shop]
|
Commit A "Initial shop drawing"
|
Commit B "Add corrections from Wizzbasket"
\
|
[asbuilt]
Commit C "Reflect as built"
|
Commit D "Change dweezelbox block for simplicity"
But somehow the "managed by coworkers" repo would be a direct mode repo with `Commit A` pointing to `drawing_shop_12_nov_2015.pdf`, `Commit B` to `drawing_shop_13_nov_2015.pdf` etc.
Can this be done?
malarkey
(11 rep)
Nov 24, 2016, 01:36 PM
1
votes
1
answers
69
views
"unknown transitions listed in local; upgrade git-annex!"
I'm not sure what is happening anymore. I disabled the assistant so I can set up what I wanted to exclude from certain repositories, after which I decided to just not use the assistant. Started working per usual with my repositories and now this error appears:
unknown transitions listed in local; upgrade git-annex!
This appears whenever I use `enableremote`, `fsck`, `get`, `move`, `drop`, and others. I've checked the source of this line, but I don't know what it means.
Braiam
(36866 rep)
Sep 21, 2016, 02:34 AM
• Last activity: Sep 24, 2016, 01:22 PM
1
votes
1
answers
386
views
git annex sync - not pulling in files
I haven't synced my git annex repositories for quite some time and was trying to synchronize them recently, but am getting an error which indicates I need to merge.
I am getting the error:
Updates were rejected because a pushed branch tip is behind its remote counterpart.
I tried to run `git annex merge` and `git pull origin master`, but git complains that I need to run it in a work tree.
[EDIT] I fixed that issue I believe, as my shell command was set to git-shell. Once I switched to git-annex-shell, it seemed to fix that issue.
Now, I noticed that I have some files missing even after I have done a sync and a get. I added both repositories to the other so repository A has remote B and repository B has remote A.
However, even after attempting to sync and get several times, I see no changes. Those files are still missing in one repository.
Walter
(1264 rep)
May 1, 2016, 02:03 AM
• Last activity: May 1, 2016, 03:30 PM
6
votes
2
answers
711
views
Init git-annex additional repo with existing files
I configured git-annex to keep track of a directory containing several GB of data. Its content is replicated on an S3 remote, so I can drop some files to free some space and get them back when I need them.
I also have another computer where I would like to do the same thing. This other computer already contains most of the files that are stored on the S3 remote.
How can I tell git-annex to init a new repository on this other computer without downloading from S3 the files that it can find in the local directory?
gioele
(2329 rep)
May 12, 2013, 08:00 PM
• Last activity: Apr 30, 2015, 05:11 PM
3
votes
0
answers
88
views
Why doesn't git-annex map realize Einstein is one repository?
TLDR
--
Have I confused git-annex into thinking one machine is actually two?
Background
--
I have a git-annex repository that has copies on three machines: Watt, Einstein, and Heisenberg in addition to a special remote on S3.
Einstein is a server, and has both external public IPs and internal private (RFC1918) ones. Watt is on the LAN, and uses Einstein's LAN IP. Heisenberg is a laptop, so uses one of its public IPs (so it can still sync even when remote).
When I run `git-annex map` on Watt, this is what I get:
[Image: git-annex map showing two Einsteins]
(Note that HeisenbergW5 is one of Heisenberg's host names, it has multiple interfaces...)
The Question
-
That looks awfully like git-annex doesn't realize the Einstein that Heisenberg is sync'ing with is the same as the one Watt is sync'ing with (but, oddly, only in one direction). Do I need to worry about this, or is it just a minor issue with `git-annex map`?
derobert
(112979 rep)
Dec 3, 2014, 06:35 PM