
Database support #134


Open
a18090 opened this issue Apr 20, 2024 · 29 comments

a18090 commented Apr 20, 2024

Many thanks for providing this project,

I ran into a difficult problem while using it: my collection is very large, about 600,000 images and videos, and I frequently interrupt the run with CTRL+C. After re-running, it tells me the local database file is damaged.
This is a nightmare. I'm wondering whether I could put the database into MySQL or MariaDB, so that re-running the import would not affect my data.
My guess is that the database gets corrupted because I interrupt it while it is writing, but with this many images a full run can take about a week each time.

Thanks again

Owner

xemle commented Apr 20, 2024

Hi @a18090

Thank you for using HomeGallery and sharing your issues here. And congratulations: you are the first person I know of who is using 600k media files. That is incredible! I am using only 100k and a friend is using 400k, so you are leading the list!

So you report that your database is corrupt? You can rebuild your database by running the importer via the CLI: ./gallery.js run import. The exact call of the CLI depends on your platform.

The rebuild will rescan your files, extract all meta data and create previews (skipping those that already exist), and then rebuild the database. See the internals section of the docs for details on the building blocks.

Since I have no experience with 600k files, that is the answer in theory; whether it will work on your machine, I do not know.

Regarding another database backend like MySQL etc.: the design decision was not to use a standard RDBMS but to load all the information into the browser, to get a snappy user experience. This decision comes with a tradeoff: it cannot scale endlessly. So I am curious how your experience is with 600k media. I would appreciate it if you could share this information:

  • How is your general setup (OS or docker)
  • Your main work flow (desktop or mobile)
  • How is the loading time of your database?
  • General user experience doing searches in the UI (are the results shown "fast" enough)?

Hope that helps

Author

a18090 commented Apr 21, 2024

Hi @xemle

1. How is your general setup (OS or docker)?
I tried Docker, but there were some problems: my network traffic gets fragmented when using WireGuard in my environment, so I deployed directly on the OS (binary). I use an SSD for caching, and the images and videos are on an HDD.
I also set video conversion to H.265 in the config file (trying to reduce the video file size): my source files are about 4 TB, and the converted files are about 1.6 TB. When indexing such a large number of files, the CPU and memory consumption is not too high, but ffmpeg's CPU usage is very high during video conversion.

2. Your main workflow (desktop or mobile)?
On Android, the "Year" page could still be handled when the data volume exceeded 200,000; after that I can only enter the site through "tags". On the desktop this problem appears at around 400,000: I cannot open the "Year" page, but I can still enter through "tags". I have tried Apple, but the results are mixed and I can't judge.
If I open the "Year" tab directly in the desktop environment, Chrome occupies more than 2 GB of RAM and is stuck for about 20 minutes, unable to be operated. However, it works normally if I enter through "tags" and let the data load before switching to "Year".
Android: 8G RAM + Snapdragon 8G3 + Chrome
Desktop: 16G RAM + Intel 1260P + Chrome
Apple: iPhone 13Pro + Chrome

3. How is the loading time of your database?
My database is on an SSD and loads very quickly, probably taking less than a minute before it starts processing new media, as my data keeps growing.

4. General user experience doing searches in the UI (are the results shown "fast" enough)?
Searches generally finish within a second or so; most of the time I don't notice any delay. For example, similar-face searches and file name searches take about the same time.
I remember my database file was about 200 MB when it was damaged, and the index was about 200-400 MB.

Regarding video conversion, I wonder whether I could load the source file directly in a LAN environment, with the program only processing the EXIF data.
Most of my videos are H.264 or newer, so Chrome should be able to play them directly (there may be audio playback issues).

The main problems I encountered are as follows:
1. The API server may disappear suddenly, but I don't know when it crashes. I am running it in a docker environment. (I guess this may have something to do with my concurrency being too high.)
2. When I enter the website through a page other than "tags" (I added tag: geo to everything whose EXIF contains GPS), the map cannot load all the data properly and entries are sometimes missing. I regularly run search/latitude%20in%20[0.0000:90.0000]%20and%20longitude%20in%20[90.0000:180.0000] and then manually add tags:geo.
3. Usually I run it directly through ./home-gallery -c home-gallery.conf and then choose "start".

Thanks again xemle. I don't know JavaScript; I tried getting AI to help me add database support, but the results were ridiculous since I only know Python.

Owner

xemle commented Apr 22, 2024

Hi @a18090

thank you for the detailed answer. Were you able to fix your database issue?

2. Your main workflow (desktop or mobile)? On Android, the "Year" page could still be handled when the data volume exceeded 200,000; after that I can only enter the site through "tags". On the desktop this problem appears at around 400,000: I cannot open the "Year" page, but I can still enter through "tags". I have tried Apple, but the results are mixed and I can't judge. If I open the "Year" tab directly in the desktop environment, Chrome occupies more than 2 GB of RAM and is stuck for about 20 minutes. However, it works normally if I enter through "tags" and let the data load before switching to "Year". Android: 8G RAM + Snapdragon 8G3 + Chrome. Desktop: 16G RAM + Intel 1260P + Chrome. Apple: iPhone 13Pro + Chrome

I am surprised that entering via tags works but the other views do not. It needs some investigation why it only works partially.

Regarding video conversion, I wonder whether I could load the source file directly in a LAN environment, with the program only processing the EXIF data. Most of my videos are H.264 or newer, so Chrome should be able to play them directly (there may be audio playback issues).

Currently, serving original files such as videos is not supported, even if the browser could play them back. This has been requested several times, e.g. in #96 and #25. Since I develop this gallery for my own needs, I have many older video formats that browsers cannot play natively, and I prefer to keep the original files separate from the gallery files.

Maybe some day there will be a plugin system which makes it easy to extend and customize this functionality.

The main problems I encountered are as follows:

  1. The API server may disappear suddenly, but I don't know when it crashes. I am running it in a docker environment. (I guess this may have something to do with my concurrency being too high.)

Please try reducing the extractor.apiServer.concurrent setting to 1 if you face problems.

  2. When I enter the website through a page other than "tags" (I added tag: geo to everything whose EXIF contains GPS), the map cannot load all the data properly and entries are sometimes missing. I regularly run search/latitude%20in%20[0.0000:90.0000]%20and%20longitude%20in%20[90.0000:180.0000] and then manually add tags:geo.

There is a search term for that: exists(geo), which matches only media with geo information, or not exists(geo) for media without geo information.

  3. Usually I run it directly through ./home-gallery -c home-gallery.conf and then choose "start".

Have you tried running ./home-gallery -c home-gallery.conf run server? It does not make a difference otherwise; it just starts the server directly.

Thanks again xemle. I don't know JavaScript; I tried getting AI to help me add database support, but the results were ridiculous since I only know Python.

So what is your biggest pain currently with HomeGallery? Can you work with it? Did you try other self-hosted galleries? How do they perform on your large media set? What features do you prefer in HomeGallery? Which features do you like in others?

Author

a18090 commented Apr 23, 2024

Hi @xemle

My database doesn't seem to have been repaired, but it's not a serious problem. I'll probably exclude the video files the next time I rebuild so it runs faster (since I have almost 110,000 videos), haha.

I am also curious about the two entry points, "Year" and "tags". I will try both pages again the next time I re-import and then take a look at the logs and Chrome's behavior.

The video problem is not very serious; after all, I reduce the bit rate and the preset when converting to increase the conversion speed.

I'm going to try reducing extractor.apiServer.concurrent to 1 and test again.

I like HomeGallery's face search and similar-image search; these two functions are very helpful. I will try to restart HomeGallery in the next few days. This time I may back up the database file regularly to reduce the risk.

I've tried other apps and there are some issues.
Photoprism: it is very smooth when managing many files, and the cache occupies about ~180 GB. Videos are transcoded on click, but its face recognition accuracy is a big problem (for me), and its database file is about 4 GB; I expect this to grow in the future.

I feel like several of the self-hosted galleries I've used may have database issues.
When using Photoprism, I rebuilt the database nearly three times before I found a solution, because when it is started with docker-compose, every Ctrl+C interruption can corrupt the database, irreversibly.
This is perhaps my biggest problem with self-hosted galleries, but I'm also trying various ways to work around it, such as regularly backing up the database file or avoiding casual Ctrl+C interruptions.

Author

a18090 commented May 5, 2024

{"level":20,"time":"2024-05-04T17:36:54.060Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":41646,"msg":"Processed entry dir cache 211 (+1)"} {"level":20,"time":"2024-05-04T17:36:54.060Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1589.9 MB, heapTotal: 1067.0 MB, heapUsed: 986.9 MB, external: 42.8 MB, arrayBuffers: 40.6 MB"} {"level":20,"time":"2024-05-04T17:37:11.586Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":17526,"msg":"Processed entry dir cache 214 (+3)"} {"level":20,"time":"2024-05-04T17:37:21.964Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":10378,"msg":"Processed entry dir cache 215 (+1)"} {"level":20,"time":"2024-05-04T17:37:36.514Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":14551,"msg":"Processed entry dir cache 216 (+1)"} {"level":20,"time":"2024-05-04T17:37:36.514Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 2731.6 MB, heapTotal: 2087.8 MB, heapUsed: 2030.5 MB, external: 153.8 MB, arrayBuffers: 151.6 MB"} {"level":20,"time":"2024-05-04T17:37:56.699Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":20185,"msg":"Processed entry dir cache 217 (+1)"} {"level":20,"time":"2024-05-04T17:38:13.818Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":17119,"msg":"Processed entry dir cache 219 (+2)"} {"level":20,"time":"2024-05-04T17:38:13.818Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 3906.7 MB, heapTotal: 3241.7 MB, heapUsed: 3163.1 MB, external: 90.8 MB, arrayBuffers: 88.6 MB"} {"level":20,"time":"2024-05-04T17:38:31.090Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":17272,"msg":"Processed entry dir cache 220 (+1)"} {"level":20,"time":"2024-05-04T17:39:10.659Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":39569,"msg":"Processed entry dir cache 226 (+6)"} {"level":20,"time":"2024-05-04T17:39:10.659Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 4555.3 MB, heapTotal: 3800.8 MB, heapUsed: 3705.3 MB, external: 185.8 MB, arrayBuffers: 183.6 MB"} {"level":30,"time":"2024-05-04T17:39:10.660Z","pid":1358080,"hostname":"cdn","module":"extractor","levelName":"info","duration":144649,"msg":"Processed 146217 entries (#146217, +23441, processing 115142 and queued 62 entries)"} {"level":20,"time":"2024-05-04T17:39:35.163Z","pid":1358080,"hostname":"cdn","module":"stream.pipe","levelName":"debug","duration":24504,"msg":"Processed entry dir cache 227 (+1)"} {"level":20,"time":"2024-05-04T17:39:51.912Z","pid":1357985,"hostname":"cdn","module":"cli.spawn","levelName":"debug","spawn":{"env":{"GALLERY_LOG_LEVEL":"trace","GALLERY_LOG_JSON_FORMAT":"true"},"command":"/tmp/caxa/home-gallery/master/home-gallery/node/bin/node","args":["/tmp/caxa/home-gallery/master/home-gallery/gallery.js","extract","--index","/data/ssd/glh/config/tdl.idx"],"pid":1358080,"code":null,"signal":"SIGABRT","cmd":"GALLERY_LOG_LEVEL=trace GALLERY_LOG_JSON_FORMAT=true /tmp/caxa/home-gallery/master/home-gallery/node/bin/node /tmp/caxa/home-gallery/master/home-gallery/gallery.js extract --index /data/ssd/glh/config/tdl.idx"},"duration":29365830,"msg":"Executed cmd GALLERY_LOG_LEVEL=trace 
GALLERY_LOG_JSON_FORMAT=true /tmp/caxa/home-gallery/master/home-gallery/node/bin/node /tmp/caxa/home-gallery/master/home-gallery/gallery.js extract --index /data/ssd/glh/config/tdl.idx"} {"level":30,"time":"2024-05-04T17:39:51.913Z","pid":1357985,"hostname":"cdn","module":"cli.spawn","levelName":"info","msg":"Cli extract --index /data/ssd/glh/config/tdl.idx exited by signal SIGABRT"}

Hi @xemle

I was rebuilding the database and tested running
./gallery -c gallery.config.yml run
directly, but the data was incomplete: the web page only showed ~270,000 images (I excluded videos). Later I tried re-indexing the small files, but it reported an error. It seems the memory overflowed?

The server has 64 GB of memory, and 18 GB is currently in use.

Owner

xemle commented May 6, 2024

Hi @a18090

Thank you for your report.

From the logs I cannot say a lot. The rebuild command ./gallery -c gallery.config.yml run is missing the actual command. Are you running server or import?

I can see that heapUsed is quite high, at about 4 GB (heapUsed: 3705.3 MB). To be honest, I do not know why the heap consumption is so high (also on my side). My assumption was that the memory consumption of the database creation step should not be that high, since the end product is only a few hundred MB, but in the end it is. I was able to fix the issue on my side by increasing the --max-old-space-size node option for the database creation. This might not be enough on your side...

You can try to set 8 GB for the database creation in your gallery config. Maybe it helps:

database:
  maxMemory: 8192

To fix this on your side I need to investigate node's memory management further and check what can be improved to keep the memory consumption low.

Author

a18090 commented May 8, 2024

Hi @xemle

thank you for your reply.
I have configured it in the settings file gallery.config.yml:

database:
   file: '{configDir}/database.db'
   maxMemory: 40960

It seems the problem happens while it is running?

I tried cat gallery.log | grep heapTotal

During the run, the cached data seems to go straight into memory and is not written to database.db.

{"level":20,"time":"2024-05-04T16:56:10.906Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1569.5 MB, heapTotal: 1077.9 MB, heapUsed: 1036.1 MB, external: 27.6 MB, arrayBuffers: 25.4 MB"}
{"level":20,"time":"2024-05-04T16:57:52.387Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1461.6 MB, heapTotal: 968.8 MB, heapUsed: 930.8 MB, external: 28.7 MB, arrayBuffers: 26.5 MB"}
{"level":20,"time":"2024-05-04T16:58:33.444Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1591.6 MB, heapTotal: 1098.9 MB, heapUsed: 1057.1 MB, external: 28.3 MB, arrayBuffers: 26.1 MB"}
{"level":20,"time":"2024-05-04T16:59:40.639Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1567.6 MB, heapTotal: 1074.5 MB, heapUsed: 1032.9 MB, external: 27.0 MB, arrayBuffers: 24.8 MB"}
{"level":20,"time":"2024-05-04T17:02:46.419Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1594.4 MB, heapTotal: 1100.1 MB, heapUsed: 1060.8 MB, external: 28.4 MB, arrayBuffers: 26.2 MB"}
{"level":20,"time":"2024-05-04T17:05:10.364Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1516.5 MB, heapTotal: 1021.8 MB, heapUsed: 970.3 MB, external: 20.4 MB, arrayBuffers: 18.2 MB"}
{"level":20,"time":"2024-05-04T17:09:11.229Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1533.8 MB, heapTotal: 1037.5 MB, heapUsed: 986.1 MB, external: 21.8 MB, arrayBuffers: 19.6 MB"}
{"level":20,"time":"2024-05-04T17:13:21.719Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1517.0 MB, heapTotal: 1020.7 MB, heapUsed: 974.6 MB, external: 21.3 MB, arrayBuffers: 19.1 MB"}
{"level":20,"time":"2024-05-04T17:16:34.564Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1623.2 MB, heapTotal: 1126.8 MB, heapUsed: 1073.8 MB, external: 22.5 MB, arrayBuffers: 20.3 MB"}
{"level":20,"time":"2024-05-04T17:17:10.270Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1488.8 MB, heapTotal: 991.7 MB, heapUsed: 943.5 MB, external: 20.1 MB, arrayBuffers: 17.9 MB"}
{"level":20,"time":"2024-05-04T17:17:54.771Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1625.5 MB, heapTotal: 1128.3 MB, heapUsed: 1080.7 MB, external: 22.7 MB, arrayBuffers: 20.5 MB"}
{"level":20,"time":"2024-05-04T17:18:44.165Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1542.6 MB, heapTotal: 1045.3 MB, heapUsed: 1005.2 MB, external: 28.1 MB, arrayBuffers: 25.9 MB"}
{"level":20,"time":"2024-05-04T17:19:59.286Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1537.4 MB, heapTotal: 1040.1 MB, heapUsed: 999.7 MB, external: 27.7 MB, arrayBuffers: 25.5 MB"}
{"level":20,"time":"2024-05-04T17:21:13.734Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1559.5 MB, heapTotal: 1062.2 MB, heapUsed: 1018.5 MB, external: 24.9 MB, arrayBuffers: 22.7 MB"}
{"level":20,"time":"2024-05-04T17:22:55.906Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1451.1 MB, heapTotal: 953.5 MB, heapUsed: 884.3 MB, external: 20.6 MB, arrayBuffers: 18.4 MB"}
{"level":20,"time":"2024-05-04T17:27:09.221Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1508.3 MB, heapTotal: 1012.0 MB, heapUsed: 966.9 MB, external: 22.0 MB, arrayBuffers: 19.8 MB"}
{"level":20,"time":"2024-05-04T17:30:00.773Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1571.2 MB, heapTotal: 1057.0 MB, heapUsed: 1005.2 MB, external: 20.7 MB, arrayBuffers: 18.5 MB"}
{"level":20,"time":"2024-05-04T17:31:24.513Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1557.7 MB, heapTotal: 1042.8 MB, heapUsed: 946.9 MB, external: 20.2 MB, arrayBuffers: 18.0 MB"}
{"level":20,"time":"2024-05-04T17:32:22.989Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1719.9 MB, heapTotal: 1204.9 MB, heapUsed: 1158.5 MB, external: 25.1 MB, arrayBuffers: 22.9 MB"}
{"level":20,"time":"2024-05-04T17:34:29.942Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1503.8 MB, heapTotal: 1024.0 MB, heapUsed: 948.0 MB, external: 21.9 MB, arrayBuffers: 19.6 MB"}
{"level":20,"time":"2024-05-04T17:36:12.414Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1748.3 MB, heapTotal: 1223.9 MB, heapUsed: 1195.1 MB, external: 21.2 MB, arrayBuffers: 19.0 MB"}
{"level":20,"time":"2024-05-04T17:36:54.060Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 1589.9 MB, heapTotal: 1067.0 MB, heapUsed: 986.9 MB, external: 42.8 MB, arrayBuffers: 40.6 MB"}
{"level":20,"time":"2024-05-04T17:37:36.514Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 2731.6 MB, heapTotal: 2087.8 MB, heapUsed: 2030.5 MB, external: 153.8 MB, arrayBuffers: 151.6 MB"}
{"level":20,"time":"2024-05-04T17:38:13.818Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 3906.7 MB, heapTotal: 3241.7 MB, heapUsed: 3163.1 MB, external: 90.8 MB, arrayBuffers: 88.6 MB"}
{"level":20,"time":"2024-05-04T17:39:10.659Z","pid":1358080,"hostname":"cdn","module":"stream.memory","levelName":"debug","msg":"Used memory: rss: 4555.3 MB, heapTotal: 3800.8 MB, heapUsed: 3705.3 MB, external: 185.8 MB, arrayBuffers: 183.6 MB"}

gallery.config.yml

configDir: '/data/ssd/glh/config'
cacheDir: '/data/ssd/glh/cache'

sources:
  - dir: '/data/hdd/tdl'
    excludes:
      - '*.mp4'
      - '*.m4a'
      - '*.mov'
    index: '{configDir}/{basename(dir)}.idx'
    maxFilesize: 20M

extractor:
  apiServer:
    url: http://127.0.0.1:3001
    timeout: 30
    concurrent: 1
  geoReverse:
    url: https://nominatim.openstreetmap.org
  useNative:
    - ffprobe
    - ffmpeg

storage:
  dir: '{cacheDir}'

database:
  file: '{configDir}/database.db'
  maxMemory: 40960
events:
  file: '{configDir}/events.db'

server:
  port: 3000
  host: '0.0.0.0'
  openBrowser: false
  auth:
    users:
      - xxx: '{SHA}xxxx'
  basePath: /
  watchSources: true

logger:
  - type: console
    level: info
  - type: file
    level: debug
    file: '/data/ssd/glh/config/gallery.log'

Owner

xemle commented May 13, 2024

Hi @a18090

Thank you for your logs. It seems that the heap memory consumption grows too much at the end of your logs.

As I wrote earlier: I currently cannot explain why such high consumption is required. The time to investigate such cases is also very limited at the moment, since I am using my spare time to work on the plugin system to open the gallery to custom functions.

I am very sorry, but I cannot help here at the moment. I will keep it in mind, because it bugs me that the consumption is that high while in theory it should be slim.

Owner

xemle commented Jun 30, 2024

I did some analysis regarding memory consumption. IMHO I was not able to detect a memory leak or a larger memory issue, except that the database building process needs a lot of memory, since it loads all the data into memory.

My database with 100k entries requires 200 MB of uncompressed JSON data, and the database creation succeeds with a 750 MB heap size.

The process can be optimized in a streaming way, which should require less memory since it would not need to load the whole database into memory while creating it. This would enable building larger galleries with more than 400k images on less memory.
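A minimal sketch of what such a stream-based build could look like, assuming Node's stream.pipeline and a newline-delimited JSON index as input (the file names, entry fields and the split2 helper are illustrative assumptions, not HomeGallery's actual code):

// Sketch: build the database entry by entry instead of loading everything into memory
import { createReadStream, createWriteStream } from 'node:fs'
import { createGunzip, createGzip } from 'node:zlib'
import { Transform } from 'node:stream'
import { pipeline } from 'node:stream/promises'
import split2 from 'split2' // splits the gunzipped stream into lines and parses each as JSON

const toDatabaseEntry = new Transform({
  objectMode: true,
  transform(entry, _enc, cb) {
    // Map one index entry to one database entry and forward it immediately,
    // so only a single entry is held in memory at a time
    cb(null, JSON.stringify({ id: entry.id, date: entry.date, previews: entry.previews }) + '\n')
  }
})

await pipeline(
  createReadStream('index.ndjson.gz'),
  createGunzip(),
  split2(JSON.parse),
  toDatabaseEntry,
  createGzip(),
  createWriteStream('database.ndjson.gz')
)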

The current workaround is to provide more heap space to the database building process.

On the server and client side the gallery still needs to load the database into memory, but I guess that is more efficient since it is read only.

@a18090 Is this memory issue still relevant for you?

Author

a18090 commented Jul 10, 2024

Hi @xemle
First of all, thank you for your efforts.

Memory is not very critical for me, because my server has 128 GB of memory, and my computers and phones generally have 12-16 GB, so most of the time it is fine. When accessing the gallery with Chrome on the PC I noticed that the heap gets very large, but I recently ran into a problem that has prevented me from using home-gallery.

After running for a while my database.db gets corrupted and can no longer be read, as shown below:

(base) root@cdn:/data/ssd/glh/config# ls -la
total 1181468
drwxr-xr-x 1 root root       204 Jun 20 16:11 .
drwxr-xr-x 1 root root       102 Jun 18 06:12 ..
-rw-r--r-- 1 root root 140766901 Jun 20 14:19 database.db
-rw-r--r-- 1 root root 136943966 Jun  1 07:48 db.bak
-rw-r--r-- 1 root root     38402 May  8 15:12 dump.log
-rw-r--r-- 1 root root      4890 Apr 29 01:09 events.db
-rw-r--r-- 1 root root 889740155 Jun 20 16:09 gallery.log
-rw-r--r-- 1 root root  42275624 Jun 20 16:11 tdl.idx
-rw-r--r-- 1 root root     20407 Jun 18 06:01 tdl.idx.0618-3XYu.journal
-rw-r--r-- 1 root root     18330 Jun 20 16:11 tdl.idx.0620-JHz2.journal

(base) root@cdn:/data/ssd/glh/config# file database.db
database.db: gzip compressed data, from Unix, original size modulo 2^32 537025063

(base) root@cdn:/data/ssd/glh/config# file db.bak
db.bak: gzip compressed data, from Unix, original size modulo 2^32 521730731 gzip compressed data, unknown method, ASCII, extra field, has comment, encrypted, from FAT filesystem (MS-DOS, OS/2, NT), original size modulo 2^32 521730731

db.bak is the database.db I backed up some time ago.
I tried restoring it to database.db, but it still crashes and reports errors after running for a while.

(base) root@cdn:/data/ssd/glh# ./gallery -c gallery.config.yml
✔ Gallery main menu · server
[2024-07-10 12:55:47.553]: cli.spawn Run cli with server
[2024-07-10 12:55:47.863]: server.auth Set basic auth for users: xxx
[2024-07-10 12:55:47.867]: server Your own Home Gallery is running at http://localhost:3000
[2024-07-10 12:55:47.868]: server.cli Run cli with run import --initial --update --watch...
[2024-07-10 12:55:48.064]: cli.run Import online sources: /data/hdd/tdl
[2024-07-10 12:55:48.064]: cli.task.import Run import in watch mode. Start watching source dirs for file changes: /data/hdd/
/tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:78
        return cb(err);
               ^

TypeError: cb is not a function
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:78:16
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:13:14
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/database/dist/database/read-database.js:9:14
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/common/dist/fs/read-json-gzip.js:14:12
    at node:internal/util:519:12
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

Node.js v20.13.1
Error: Error: Server exited with code 1 and signal null

I will run the test again after the update; I will clear the log and keep an eye on it.

Owner

xemle commented Jul 10, 2024

Hi @a18090

Thank you for the update. I guess you identified a bug, and it seems the bug is quite old and my tests were not covering it. I will provide a fix in the next days. Then the error cb is not a function should vanish and you can retest.

Author

a18090 commented Jul 12, 2024

Hi @xemle

I found an interesting situation when I ran it again.

When I open the site with Chrome on my laptop and enter the "Year" tab, it gets stuck in a waiting state while it is loading.

This is the browser console log:

Connected to server events
event-store.ts:44 Applied 13 events and updated 0 entries in 0ms
ApiService.ts:127 Syncing database entries with offline database...
ApiService.ts:119 Found 1 from 1 trees out of sync in 1ms
ApiService.ts:121 Fetching 1 missing tree objects
ApiService.ts:95 Fetched 1 trees and 0 entries for offline database
ApiService.ts:95 Fetched 40 trees and 0 entries for offline database
ApiService.ts:95 Fetched 50 trees and 265 entries for offline database
search-store.ts:54 update query to 1 entries from 1 entries
ApiService.ts:95 Fetched 25 trees and 6244 entries for offline database
search-store.ts:54 update query to 261 entries from 261 entries
ApiService.ts:95 Fetched 57 trees and 9789 entries for offline database
search-store.ts:54 update query to 6473 entries from 6473 entries
ApiService.ts:95 Fetched 21 trees and 6915 entries for offline database
search-store.ts:54 update query to 13170 entries from 13170 entries
ApiService.ts:95 Fetched 0 trees and 9800 entries for offline database
search-store.ts:54 update query to 19572 entries from 19572 entries
ApiService.ts:95 Fetched 0 trees and 9323 entries for offline database
search-store.ts:54 update query to 28157 entries from 28157 entries
ApiService.ts:95 Fetched 0 trees and 1408 entries for offline database
ApiService.ts:95 Fetched 0 trees and 1456 entries for offline database
ApiService.ts:95 Fetched 0 trees and 134 entries for offline database
search-store.ts:54 update query to 39129 entries from 39129 entries
search-store.ts:54 update query to 45404 entries from 45404 entries
search-store.ts:54 update query to 48686 entries from 48686 entries
search-store.ts:54 update query to 70302 entries from 70302 entries
search-store.ts:54 update query to 80169 entries from 80169 entries
search-store.ts:54 update query to 113196 entries from 113196 entries
search-store.ts:54 update query to 124086 entries from 124086 entries
search-store.ts:54 update query to 138204 entries from 138204 entries
search-store.ts:54 update query to 153505 entries from 153505 entries
search-store.ts:54 update query to 158739 entries from 158739 entries
search-store.ts:54 update query to 169621 entries from 169621 entries
search-store.ts:54 update query to 178623 entries from 178623 entries
search-store.ts:54 update query to 186852 entries from 186852 entries
search-store.ts:54 update query to 195096 entries from 195096 entries
search-store.ts:54 update query to 203734 entries from 203734 entries
search-store.ts:54 update query to 213179 entries from 213179 entries
search-store.ts:54 update query to 218025 entries from 218025 entries
search-store.ts:54 update query to 224955 entries from 224955 entries
search-store.ts:54 update query to 227904 entries from 227904 entries
search-store.ts:54 update query to 231714 entries from 231714 entries
search-store.ts:54 update query to 234353 entries from 234353 entries
search-store.ts:54 update query to 236388 entries from 236388 entries
search-store.ts:54 update query to 239641 entries from 239641 entries
search-store.ts:54 update query to 241315 entries from 241315 entries
search-store.ts:54 update query to 243367 entries from 243367 entries
search-store.ts:54 update query to 246433 entries from 246433 entries
search-store.ts:54 update query to 249929 entries from 249929 entries
search-store.ts:54 update query to 251537 entries from 251537 entries
search-store.ts:54 update query to 253763 entries from 253763 entries
search-store.ts:54 update query to 255617 entries from 255617 entries
**search-store.ts:54 update query to 257054 entries from 257054 entries**
faces.ts:35 Face search: Took 19ms to select, 103ms to calculate, to sort 86ms, to map 0ms
faces.ts:35 Face search: Took 21ms to select, 88ms to calculate, to sort 88ms, to map 0ms
faces.ts:35 Face search: Took 20ms to select, 96ms to calculate, to sort 88ms, to map 0ms
faces.ts:35 Face search: Took 19ms to select, 96ms to calculate, to sort 86ms, to map 0ms
faces.ts:35 Face search: Took 21ms to select, 97ms to calculate, to sort 88ms, to map 0ms
...

update query to 92212 entries from 258659 entries
search-store.ts:54 update query to 92212 entries from 258660 entries
search-store.ts:54 update query to 92213 entries from 258661 entries
search-store.ts:54 update query to 92214 entries from 258662 entries
search-store.ts:54 update query to 92214 entries from 258663 entries
search-store.ts:54 update query to 92215 entries from 258664 entries
...

update query to 92523 entries from 259386 entries
faces.ts:35 Face search: Took 21ms to select, 101ms to calculate, to sort 93ms, to map 0ms
faces.ts:35 Face search: Took 21ms to select, 93ms to calculate, to sort 96ms, to map 0ms
...
more...

If I use Chrome on a desktop computer to access the site, this problem does not occur: there is no continuous log output, clicks work normally, and Chrome does not become unresponsive.
Chrome does, however, enter a continuous cycle: memory usage increases by about 3 GB, drops by about 1.4 GB, increases by 3 GB again, drops by 1.x GB, and so on.

[Chrome screenshot]

I suspect this may be a front-end page problem. I'll try to look into it and see whether I can help you (of course, I'm not sure I can, haha).

Owner

xemle commented Jul 12, 2024

Hi @a18090 thank you for your update.

From your latest console logs I read that the database can be loaded with 259386 entries, which sounds good. A friend of mine has 400k images, which should work, too.

Regarding your previous error

[2024-07-10 12:55:48.064]: cli.task.import Run import in watch mode. Start watching source dirs for file changes: /data/hdd/
/tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:78
        return cb(err);
               ^

TypeError: cb is not a function
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:78:16
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/server/dist/api/database/read-database.js:13:14
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/database/dist/database/read-database.js:9:14
    at /tmp/caxa/home-gallery/master/home-gallery/node_modules/@home-gallery/common/dist/fs/read-json-gzip.js:14:12
    at node:internal/util:519:12
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

Node.js v20.13.1
Error: Error: Server exited with code 1 and signal null

I found the issue. The error happens when the database could not be read. Unfortunately the error message with the cause is swallowed by this bug, so I cannot tell why the database cannot be read. A following fix will change that.

My best guess is that the database cannot be read due to memory issues in the server component, especially while you are importing a new, even larger database: then the server has to keep two large versions in memory for a short period of time. This could be worked around with the environment variable NODE_OPTIONS=--max-old-space-size=4096 to run all node processes with at most 4 GB of memory, e.g. NODE_OPTIONS=--max-old-space-size=4096 ./gallery.js run server for the server.

I doubt that the database itself is corrupt, because on database creation the new database is written to a temporary file which is then renamed to the target database filename. This is a common way to provide a kind of atomic file creation, and the rename should only happen in the no-error case.
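A minimal sketch of that write-to-temp-then-rename pattern, assuming Node's fs/promises (illustrative only, not HomeGallery's actual implementation):

import { writeFile, rename } from 'node:fs/promises'
import { gzipSync } from 'node:zlib'

export async function writeDatabaseAtomically(file: string, database: object): Promise<void> {
  const tmp = `${file}.tmp-${process.pid}`
  // Write the complete new database to a temporary file first...
  await writeFile(tmp, gzipSync(JSON.stringify(database)))
  // ...and only rename it onto the target on success. On most filesystems the rename is
  // atomic, so readers see either the old or the new file, never a half-written one.
  await rename(tmp, file)
}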

Would you mind checking your database with zcat database.db | jq .data[].id | wc -l? This should print the number of database entries and should prove that the data is intact.

Author

a18090 commented Jul 14, 2024

Hi @xemle

I've removed the problematic database and restored the backup, then rebuilt the database. Now zcat database.db | jq .data[].id | wc -l reports 357198.

But that number doesn't seem right: listing the files shows about 462626 JPGs.

The full file count including videos is roughly 597817.

Owner

xemle commented Jul 14, 2024

Can you also check the file index with zcat files.idx | jq .data[].filename | wc -l? How many files are indexed by the gallery?

Author

a18090 commented Jul 15, 2024

(base) root@cdn:/data/ssd/glh/config# zcat tdl.idx | jq .data[].filename | wc -l
461589

(base) root@cdn:/data/ssd/glh/config# zcat database.db | jq .data[].id | wc -l
357198

(base) root@cdn:/data/hdd/tdl# find . |wc -l
599809

(base) root@cdn:/data/hdd/tdl# find . -type f -name *.jpg |wc -l
460337

Owner

xemle commented Jul 17, 2024

Hi @a18090

thank you for the numbers. To summarize

  • 599809 Dir and files in total
  • 460337 jpg files
  • 461589 files are indexed (and seen) by the gallery
  • 357198 media files are created

The difference between jpg files and indexed files is that the gallery also indexes other files, like meta files.

The more important question is why there is such a big gap between media entries and indexed files, since there should not be that many meta files.

My best guess is that the import was done in multiple steps. Currently the algorithm does not correctly recover the files after the import process is restarted (I did not find the time/interest to implement that yet). Therefore it is recommended to rerun the full import after all media files have been processed.

Would you mind rerunning a full import via ./gallery.js run import and checking the numbers again?

Author

a18090 commented Jul 18, 2024

Thanks, no problem, I'll try it.

Owner

xemle commented Jul 30, 2024

Hi @a18090 I've updated the master with a stream-based database creation. This should be less memory demanding, and your 400,000 images should be better to read and to update.

Please try it out and report whether you have any issues with the newest version.

Author

a18090 commented Aug 5, 2024

Hi @xemle
I solved the file count issue and database.db grew to ~491236 entries, maybe due to PNG or other formats (my file count also grew to ~490629 per find . -type f -name *.jpg |wc -l).
My server doesn't have Node.js, so I used your binary ("Jul 25 14:14" version).

This is the resource usage:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16951 root 20 0 12.2g 1.0g 43708 S 0.0 0.8 16:02.33 node ....../gallery.js server
16962 root 20 0 14.0g 2.6g 42752 S 0.0 2.1 2:17.11 node ....../gallery.js run import --initial --update --watch

I noticed an error occurred

{"level":50,"time":"2024-08-05T11:23:06.096Z","pid":11579,"hostname":"cdn","module":"database.media.date","levelName":"error","msg":"Could not create valid ISO8601 date '2023-01-21T00:11:77' from '\"2023:01:21 00:11:77\"' of entry e799fe8:tdl:xxx/File/IMG_2406.JPEG: RangeError: Invalid time value"}

I will also try multi-platform testing later

Thanks for your efforts


@kryztoval

I am in the same boat, already reaching 600k files (out of 2 million images), and I am already seeing the slowness. The impact on the browser is huge: it had no issues with fewer than 300k images, but as soon as I hit 500k it started slowing down.

I am using the latest version, and I am seriously considering moving the database into a database engine or sending the browser only the information it needs. I keep seeing the browser update the positioning of the recently loaded images and slow down more and more.


kryztoval commented Apr 16, 2025

So, my numbers:


# zcat index.idx | jq .data[].filename | wc -l
548347

# zcat database.db | jq .data[].id | wc -l
543566

# find /mnt/photos/ |wc -l
5313979

# find /mnt/photos/ -type f -iname \*.jpg -or -iname \*.jpeg -or -iname \*.heic -or -iname \*.png  |wc -l
4137370

I also excluded mkv and mp4 files because they were clogging my cache. I would love to set those to use the original file instead of transcoding it. But alas, I am in a privileged position to test this with a lot of files. :)

Owner

xemle commented Apr 16, 2025

Hi @kryztoval

Thank you for your input and your numbers. Yours are the highest reported so far. Congratulations, you rule the board!

As stated in #134 (comment), the memory consumption of the database creation was improved. Now I have pushed an improvement of the offline database to the master.

However, the architecture and database design of the gallery were never meant to serve much more than my own collection of 100k media entries. My original assumption was that all required data can be loaded into the browser, so that filtering and sorting of the media can be done quickly within the browser. Server requests, a server-based database and database scaling were out of scope. As a result, the server currently has to read the whole database file and keep and serve it in memory...

So your 5.3M photos exceed my assumption and experience by a factor of 50. This is a lot.

It is very tricky to give advice, or even solutions, on whether 5M images can be handled by the gallery. So I would like to return to the start: what kind of problem would you like to solve, and how does HomeGallery fit your needs? Maybe there is an unseen possibility to customize it and strip down data. E.g. the AI features for face, object and similarity detection require a lot of bytes; by disabling them the overall performance should improve. But I do not know if this is an option...

@kryztoval I guess you have been testing other open source web galleries out there, too? How do they perform on your large media directory?

@kryztoval

@xemle Yes, indeed. I understand. And yes, the memory consumption of the database was improved a lot, I did see that.

I do not really require that much; the thing home-gallery did incredibly well was its speed loading images. I would love it if it could also play the original video without converting it, but that is it. I tried several solutions and even thought of making my own, but I always struggled with the interface part, and your project nailed the interface almost to a T. I used the tags to mark actions on the filesystem, allowing me to delete files directly and remove them from the list of images with ease.

I wanted it to be in a database so I could index the files myself. I will try to take a look at the interface and see if I can make it use a database instead of loading everything from memory. It sounds like a great challenge!

I will keep on the lookout for other galleries too. Thank you so much! :)

Owner

xemle commented Apr 18, 2025

Hi @kryztoval

Thank you for your response and thank you for your positive feedback about the gallery.

Regarding the database: the general challenge is that a query should be fast. To be fast, the data should be in fast memory. Depending on the data size there is always a threshold where the data can no longer be handled by a single machine due to memory limitations; if it does not fit into memory, the data needs to be distributed across multiple machines.

HomeGallery serves a list of private images, and the assumption is that the amount of private images is limited, so the data should fit into memory. A further assumption is that the amount of images is so "small" that the data can be loaded into the browser's memory, so that database queries can be performed in the browser, even complex queries with boolean expressions and similarity vector comparisons.

The numbers of my setup are:

  • 100,000 images
  • 50 MB database (gzipped JSON)
  • 250 MB uncompressed database
  • 2,400 bytes per database entry on average

For larger image sets the question is: does this setup scale without changing the basic architecture? It will scale up to a certain factor. My gut feeling is that it could scale by a factor of 4; a factor of 10 is critical.

Your factor is 50, and I would be impressed if the current setup could handle that. My assumption is it won't. I also do not know whether a database like SQLite or Postgres on a single host could handle the current query complexity in reasonable time.

So what are the options IMHO:

Reduce the data per entry

A database entry can be reduced to a bare minimum with

  • id (40 bytes)
  • 1 filename (64 bytes) with a quite flat file hierarchy
  • date (24 bytes)
  • preview sizes (20 bytes)
  • tags (20 bytes)
  • json keys (25 bytes)

This sums up to about 200 bytes.

That is a factor of 10 reduction. The query capabilities are then limited to mainly date and tags; no similarity.

The question here is whether the query times in the browser are still sufficient. Currently a query takes up to 200 ms, so a factor of 10 would be 2 s in the worst case.
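A rough sketch of such a stripped-down entry in TypeScript; the field names are illustrative assumptions, not HomeGallery's actual schema, and the sizes roughly follow the estimate above:

interface MinimalEntry {
  id: string          // e.g. a 40-character hash
  file: string        // single filename in a fairly flat hierarchy
  date: string        // ISO8601 date
  previews: number[]  // available preview sizes, e.g. [320, 1280]
  tags: string[]      // user tags
}

const example: MinimalEntry = {
  id: 'e799fe8c0f1d2b3a4c5d6e7f8091a2b3c4d5e6f7',
  file: '2023/IMG_2406.JPEG',
  date: '2023-01-21T00:11:00',
  previews: [320, 1280],
  tags: ['geo']
}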

The other option is to move the database handling to the server.

Can the current setup handle a database with 5M entries in reasonable time? I doubt it.

The complexity of moving the data to a standard database like SQLite or Postgres is high. It turns the current architecture upside down, introduces new dependencies, and both code paths would need to be supported (a 5M database is more an edge case than the average for a private image gallery, so I would keep the browser-based database feature).

One option could be to split the database across multiple gallery instances. NodeJS is single threaded, while current systems provide several CPU cores. Splitting into gallery instances can be done by directories. That way you keep the query complexity, and the server component already supports the same query language.

The query then needs to be split across the instances, and the results need to be merged in one place. The merge has to evaluate the sort order again, since some order keys, such as the similarity score, are not transported in the current results.

Depending on the performance, the gallery instances can also be distributed to different machines if performance becomes poor.

Splitting your database across multiple instances sounds promising to me and requires the following changes (see the sketch after this list):

  • Move the query to server (medium complexity)
  • Distribute the query to multiple instances (simple complexity)
  • Merge the result (medium complexity)

Maybe also introduce search result paging, which returns only part of the result but provides metadata like the total match count, so that the view can simulate the "height" of the result page.
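A rough sketch of that fan-out and merge, assuming each instance exposes a search endpoint returning JSON entries (the instance URLs, endpoint path and response shape are assumptions for illustration):

type Entry = { id: string; date: string }

const instances = ['http://localhost:3001', 'http://localhost:3002']

export async function federatedSearch(query: string, limit = 100): Promise<Entry[]> {
  // Ask every instance for its own top matches in parallel...
  const partials = await Promise.all(
    instances.map(async base => {
      const res = await fetch(`${base}/api/search?q=${encodeURIComponent(query)}&limit=${limit}`)
      const body = await res.json()
      return body.entries as Entry[]
    })
  )
  // ...then merge and re-sort in one place, since per-instance order keys
  // (e.g. similarity scores) are not comparable across instances without re-evaluation
  return partials
    .flat()
    .sort((a, b) => b.date.localeCompare(a.date))
    .slice(0, limit)
}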

What do you think? Which solution do you have in mind? You said you want to implement things on your own: how good are your programming skills?

@kryztoval

I looked at the structure of your database and I see that you are using JSON, so basically this is a document database. As such, I think the fastest way to put this into a database for me would be to use Mongo or CouchDB, because the documents would not need to be translated at all.

I also took a look at the queries the browser makes to the API node, and I think I could make the API node gather the information from the document database too.

MongoDB and CouchDB have the added benefit of being shardable and distributed in a semi-transparent way to the client.

I think that for a single person a local instance of MongoDB would be way more efficient than keeping all the information in memory. I did a little project a few months back where I used static HTML files to interact with a RESTful service that forwarded queries either to the original API or gathered data from MongoDB records; that Mongo server currently contains around 73807067 documents in just one document store and it responds at a decent pace.

I was thinking that the API server would be a good place to put this logic.
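A hypothetical sketch of what serving entries from MongoDB behind the API server could look like; HomeGallery has no MongoDB backend today, and the route, collection and field names are made up for illustration:

import express from 'express'
import { MongoClient } from 'mongodb'

const client = new MongoClient('mongodb://localhost:27017')
await client.connect()
const entries = client.db('gallery').collection('entries')

const app = express()

app.get('/api/entries', async (req, res) => {
  const tag = req.query.tag as string | undefined
  // Translate a tiny subset of the query language into a MongoDB filter
  const filter = tag ? { tags: tag } : {}
  const page = await entries.find(filter).sort({ date: -1 }).limit(100).toArray()
  res.json({ data: page })
})

app.listen(3002, () => console.log('entry API listening on :3002'))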

My programming skills are OK, but your gallery has a lot of moving parts; it would take me a while to get familiar with everything it does.

@kryztoval

@xemle I kept playing with it. I ran the database without the images and... even at over 1 million entries it did not halt the browser. It seems the thing slowing it down is the number of loaded images. I am not sure how to tell React that once an image goes out of the IntersectionObserver, its src should be changed to a transparent GIF or some other small image, so the browser can free that image's memory, to test whether that improves the situation. The IntersectionObserver margin could be one extra viewport up and one extra viewport down, and there would still be enough freed images.
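A rough sketch of that idea as a standalone React component (assuming React/TSX; this is not HomeGallery's actual image component, and whether the browser really releases the decoded image this way would need to be measured):

import { useEffect, useRef, useState } from 'react'

// 1x1 transparent GIF used as a placeholder while the image is off screen
const BLANK = 'data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7'

export function LazyImg({ src, alt }: { src: string; alt: string }) {
  const ref = useRef<HTMLImageElement>(null)
  const [visible, setVisible] = useState(false)

  useEffect(() => {
    const el = ref.current
    if (!el) return
    // Keep one extra viewport above and below counted as "visible", as suggested
    const observer = new IntersectionObserver(
      ([entry]) => setVisible(entry.isIntersecting),
      { rootMargin: '100% 0px 100% 0px' }
    )
    observer.observe(el)
    return () => observer.disconnect()
  }, [])

  // Swap to the blank placeholder when scrolled out of range
  return <img ref={ref} src={visible ? src : BLANK} alt={alt} />
}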

I do not think the database is the reason for this stutter in the browser now.

Owner

xemle commented Apr 19, 2025

@kryztoval

Awesome that you are trying different things. A NoSQL database like MongoDB or CouchDB sounds promising. Please keep me updated.

I am not sure how to tell React that once an image goes out of the IntersectionObserver, its src should be changed to a transparent GIF or some other small image, so the browser can free that image's memory, to test whether that improves the situation.

HomeGallery uses virtual scrolling, so only the images in the current view are rendered. You can check the DOM in the browser via the developer tools. Maybe those tools will also highlight performance bottlenecks.

I do not think the database is the reason for this stutter in the browser now.

If so I am happy to help to solve this bottleneck...

If you continue working on this topic, I would also suggest switching this discussion to Discord. We can also have sessions via the voice channel.

@kryztoval

HomeGallery uses virtual scrolling, so only the images in the current view are rendered. You can check the DOM in the browser via the developer tools. Maybe those tools will also highlight performance bottlenecks.

Yes, I saw that, but the objects seem to still be referenced, or cached, because the memory used by the tab is over 1.5 GB for some reason. When it goes a bit above that, Chrome starts having issues rendering anything; literally, even requests will not get queued. So I was thinking it would be worth a try to set the image src to a transparent image to see whether the browser can free it up earlier.

If you continue working on this topic, I would also suggest switching this discussion to Discord. We can also have sessions via the voice channel.

Ok, I will see you over there. :)
