
zhardlink

Introduction

There already exist too many utilities for hardlinking identical files. The excuse for zhardlink is the frustration of not finding one that can handle the situation where more than 65000 links (the ext4 limit) would point to a single file. I tested a few, but when all of them eventually failed with EMLINK "Too many links", I decided to write my own.

The whole thing emerged from my need to compact a very large directory of cached OSM tiles. There are a great many blue water tiles in the world.

Warnings

zhardlink is not thoroughly tested. Its status is "works for me", but on the other hand, where it's used there's nothing to lose.

BEWARE OF DATA LOSS! It should not happen: nothing is removed outright, a file is only replaced by a temporary link that has already been created successfully. Contents are compared with SHA-256, which should not collide. But use it at your own risk...
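
The "temporary link first, then replace" idea can be sketched roughly as follows. This is only an illustration of the technique, not the code from link.c: the function name replace_with_link and the ".zhl-tmp" suffix are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Replace `dup` with a hardlink to `keep` so that `dup` never goes missing.
 * Hypothetical helper: returns 0 on success, -1 on failure with errno set. */
static int replace_with_link(const char *keep, const char *dup)
{
    struct stat sk, sd;
    char tmp[4096];
    int n;

    assert(keep != NULL && dup != NULL);
    if (stat(keep, &sk) != 0 || stat(dup, &sd) != 0)
        return -1;
    /* Already the same inode: nothing to do. (rename() between two names
     * of one inode is a no-op and would leave the temporary name behind.) */
    if (sk.st_dev == sd.st_dev && sk.st_ino == sd.st_ino)
        return 0;

    n = snprintf(tmp, sizeof tmp, "%s.zhl-tmp", dup); /* made-up suffix */
    if (n < 0 || (size_t)n >= sizeof tmp) {
        errno = ENAMETOOLONG;
        return -1;
    }
    /* Step 1: create the new link under a temporary name. If this fails,
     * for example with EMLINK, `dup` has not been touched at all. */
    if (link(keep, tmp) != 0)
        return -1;
    /* Step 2: atomically move the temporary link over the duplicate. */
    if (rename(tmp, dup) != 0) {
        int saved = errno;
        unlink(tmp);   /* clean up, but report the rename() error */
        errno = saved;
        return -1;
    }
    return 0;
}
```

Because the old name is only overwritten by rename() after link() has succeeded, a failure at either step leaves the original file in place.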

Install

Pick the source from zhardlink-0.90.1.tar.gz. It comes with a plain, simple Makefile and no configure script. I omitted that, since most systems are Linux nowadays.

pri:~$ tar -xzf ~/zhardlink-0.90.1.tar.gz 
pri:~$ cd zhardlink-0.90.1/
pri:~/zhardlink-0.90.1$ make
gcc -D_GNU_SOURCE -DVERSION="\"0.90.1\"" -O2 -std=c99 -c common.c -o common.o
gcc -D_GNU_SOURCE -DVERSION="\"0.90.1\"" -O2 -std=c99 -c link.c -o link.o
gcc -D_GNU_SOURCE -DVERSION="\"0.90.1\"" -O2 -std=c99 -c main.c -o main.o
gcc -D_GNU_SOURCE -DVERSION="\"0.90.1\"" -O2 -std=c99 -c scan.c -o scan.o
gcc -D_GNU_SOURCE -DVERSION="\"0.90.1\"" -O2 -std=c99 -c sha256.c -o sha256.o
gcc -O2 -std=c99 common.o link.o main.o scan.o sha256.o -s -o zhardlink
    

Examples

Help

pri:~/zhardlink-0.90.1$ ./zhardlink --help
Usage: zhardlink [OPTIONS] [directory]...
Hardlink identical files in given directories

OPTIONS

  -h, --help                 Print this message and exit
  -v, --verbose              Print extra messages and report hardlinked files.
...
    

Re-scanning with an increasing maximum link count.

pri:~/zhardlink-0.90.1$ mkdir z-test
pri:~/zhardlink-0.90.1$ for a in $(seq 10 19); do cp /etc/passwd z-test/$a; done

pri:~/zhardlink-0.90.1$ ./zhardlink -c --max-links=2 z-test/
Scanning directory z-test...
Enumerating and hardlinking...
Total 5 files, 8 KiB linked (20480  disk space saved)

pri:~/zhardlink-0.90.1$ ls -li z-test/
total 40
2887840 -rw-r--r-- 2 me mygroup 1756 24. 1. 16:44 10
2887845 -rw-r--r-- 2 me mygroup 1756 24. 1. 16:44 11
2887842 -rw-r--r-- 2 me mygroup 1756 24. 1. 16:44 12
2887842 -rw-r--r-- 2 me mygroup 1756 24. 1. 16:44 13
2887844 -rw-r--r-- 2 me mygroup 1756 24. 1. 16:44 14
2887845 -rw-r--r-- 2 me mygroup 1756 24. 1. 16:44 15
2887844 -rw-r--r-- 2 me mygroup 1756 24. 1. 16:44 16
2887847 -rw-r--r-- 2 me mygroup 1756 24. 1. 16:44 17
2887847 -rw-r--r-- 2 me mygroup 1756 24. 1. 16:44 18
2887840 -rw-r--r-- 2 me mygroup 1756 24. 1. 16:44 19

pri:~/zhardlink-0.90.1$ ./zhardlink -c --max-links=5 z-test/
Scanning directory z-test...
Enumerating and hardlinking...
Total 7 files, 12 KiB linked (28672  disk space saved)
pri:~/zhardlink-0.90.1$ ls -li z-test/
total 40
2887840 -rw-r--r-- 5 me mygroup 1756 24. 1. 16:44 10
2887840 -rw-r--r-- 5 me mygroup 1756 24. 1. 16:44 11
2887840 -rw-r--r-- 5 me mygroup 1756 24. 1. 16:44 12
2887842 -rw-r--r-- 5 me mygroup 1756 24. 1. 16:44 13
2887842 -rw-r--r-- 5 me mygroup 1756 24. 1. 16:44 14
2887840 -rw-r--r-- 5 me mygroup 1756 24. 1. 16:44 15
2887842 -rw-r--r-- 5 me mygroup 1756 24. 1. 16:44 16
2887842 -rw-r--r-- 5 me mygroup 1756 24. 1. 16:44 17
2887842 -rw-r--r-- 5 me mygroup 1756 24. 1. 16:44 18
2887840 -rw-r--r-- 5 me mygroup 1756 24. 1. 16:44 19

pri:~/zhardlink-0.90.1$ ./zhardlink -c z-test/
Scanning directory z-test...
Enumerating and hardlinking...
Total 5 files, 8 KiB linked (20480  disk space saved)
pri:~/zhardlink-0.90.1$ ls -li z-test/
total 40
2887840 -rw-r--r-- 10 me mygroup 1756 24. 1. 16:44 10
2887840 -rw-r--r-- 10 me mygroup 1756 24. 1. 16:44 11
2887840 -rw-r--r-- 10 me mygroup 1756 24. 1. 16:44 12
2887840 -rw-r--r-- 10 me mygroup 1756 24. 1. 16:44 13
2887840 -rw-r--r-- 10 me mygroup 1756 24. 1. 16:44 14
2887840 -rw-r--r-- 10 me mygroup 1756 24. 1. 16:44 15
2887840 -rw-r--r-- 10 me mygroup 1756 24. 1. 16:44 16
2887840 -rw-r--r-- 10 me mygroup 1756 24. 1. 16:44 17
2887840 -rw-r--r-- 10 me mygroup 1756 24. 1. 16:44 18
2887840 -rw-r--r-- 10 me mygroup 1756 24. 1. 16:44 19
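
The numbers reported for a fresh run can be checked with a little arithmetic. The sketch below is not taken from zhardlink. It assumes that n identical, not yet linked files are packed into as few inodes as the link limit allows, and that every file occupies exactly one 4096-byte block. The function names are invented for the example.

```c
#include <assert.h>

#define BLOCK_SIZE 4096UL /* assumption: one 4 KiB block per small file */

/* Inodes needed for n identical files when a single inode can carry
 * at most max_links names (ceiling division). */
static unsigned long inodes_needed(unsigned long n, unsigned long max_links)
{
    assert(max_links > 0);
    return n == 0 ? 0 : (n - 1) / max_links + 1;
}

/* Files that get replaced by a hardlink: all but one per inode. */
static unsigned long files_linked(unsigned long n, unsigned long max_links)
{
    return n - inodes_needed(n, max_links);
}

/* Disk space freed, under the one-block-per-file assumption. */
static unsigned long bytes_saved(unsigned long n, unsigned long max_links)
{
    return files_linked(n, max_links) * BLOCK_SIZE;
}
```

For the first run above, files_linked(10, 2) is 5 and bytes_saved(10, 2) is 20480, which agrees with the reported totals. The later runs start from files that are already partly linked, so this simple formula does not apply to them.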
    

Filesystem limits: pathconf(_PC_LINK_MAX)

pri:~/zhardlink-0.90.1$ mkdir z-large
pri:~/zhardlink-0.90.1$ for a in $(seq 100000 200000); do echo '1' > z-large/$a; done

pri:~/zhardlink-0.90.1$ ./zhardlink -c z-large/
Scanning directory z-large...
Enumerating and hardlinking...
 Progress 100001/100001   (compared/total)
Total 99999 files, 195 KiB linked (390 MiB disk space saved)

pri:~/zhardlink-0.90.1$ ls -li z-large/ | head -10
total 400004
2950869 -rw-r--r-- 65000 me mygroup 2 24. 1. 16:57 100000
2950869 -rw-r--r-- 65000 me mygroup 2 24. 1. 16:57 100001
2950869 -rw-r--r-- 65000 me mygroup 2 24. 1. 16:57 100002
2902053 -rw-r--r-- 35001 me mygroup 2 24. 1. 16:57 100003
2902053 -rw-r--r-- 35001 me mygroup 2 24. 1. 16:57 100004
2950869 -rw-r--r-- 65000 me mygroup 2 24. 1. 16:57 100005
2950869 -rw-r--r-- 65000 me mygroup 2 24. 1. 16:57 100006
2950869 -rw-r--r-- 65000 me mygroup 2 24. 1. 16:57 100007
2950869 -rw-r--r-- 65000 me mygroup 2 24. 1. 16:57 100008
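
The limit itself can be asked from the filesystem with pathconf(). A minimal sketch, assuming the query is made once per scanned directory. The wrapper name max_links_for is made up here and is not necessarily how zhardlink does it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Maximum number of hardlinks per file on the filesystem holding `dir`.
 * Returns -1 with errno == 0 when no fixed limit is reported, or
 * -1 with errno set when the query itself fails. */
static long max_links_for(const char *dir)
{
    assert(dir != NULL);
    errno = 0; /* pathconf() leaves errno untouched when there is no limit */
    return pathconf(dir, _PC_LINK_MAX);
}
```

On the ext4 filesystem used in the example above, one would expect this to report 65000, the link count at which the first inode in the listing is full. A caller could use the smaller of this value and --max-links as the point where a new inode has to be started.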
    

Miscellaneous

Contact

Bugs, errors and suggestions: sub@nanona.fi