
Foobar 2.0 tag caching

As far as I have seen, the database has been changed in foobar2000 v2.0 and tags are now retrieved entirely from files, bypassing the old tag length limits... which is great for some use-cases like fingerprinting and lyrics.

But now many actions which were previously done in a few ms easily take 50x or 100x more time on foobar2000 v2.0:
- Complex queries
- Any plugin which uses tags from the entire library: foo_dbsearch, foo_httpcontrol, ...
- foo_upnp: now increases the startup time by 30-60 secs (previously it was mostly instant).
- Any SMP/JS script relying on tags: for example, some of my scripts which previously took 2 secs to create complex playlists with harmonic mixing on 70K tracks now take 10-20 secs on foobar2000 v2.0. Creating an ad hoc cache in JSON brings the processing time back to 2-3 secs. I know JSP has been updated to improve the situation, but that's just a patch to a design problem.

The current situation implies foobar2000 may take somewhere between 1 and 2 minutes to load on startup as soon as you have a few complex autoplaylists or plugins like those.

Peter, could you consider adding tag caching for configurable tags on top of the current database? I'm not talking about porting back the old tag caching with length limits... but simply caching the entire tag values for the specific tags the user wants on startup; i.e. reading those tags from the database and keeping them persistent in RAM during the entire session.

The point of using x64 binaries without RAM limits is making use of all available memory to improve the user experience, which is now worse than before due to the absence of caching. The current x86 v2.0 build also uses less RAM now, so RAM usage can safely be increased for some tags. Right now, I see no point in updating to v2.0, since performance is worse and many plugins still lack an update.

Re: Foobar 2.0 tag caching

Reply #1
And this is essentially the same issue I was talking about here: the size of the database has multiplied and media search takes more time now, which is upsetting.
I'm a tag maniac, so most of my tracks are filled with extra info, including lyrics etc. Moreover, I tested fb2k v2 on a RAM drive; I can't imagine how long it would take to search on an HDD or SATA SSD with this ~400 MB library database.

Re: Foobar 2.0 tag caching

Reply #2
Quote:
Creating an ad hoc cache in JSON brings the processing time back to 2-3 secs. I know JSP has been updated to improve the situation, but that's just a patch to a design problem.
There you have it - it's still possible to deliver acceptable performance using the new batch info query methods that I provided, without caching all the info all the time. If I give up right away and cache everything like old foobar2000 did, nobody will use them.
Providing zero-cost access to all info was the design problem; it made everyone abuse it. Even my old code hammered metadb with multiple queries per processed track.
Before, foobar2000 startup was slow due to the huge library being loaded regardless of the user interface in use.
Now, if your user interface doesn't block waiting for data, startup is instant. And this is bad design according to you, right?
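To make the batching point above concrete, here is a minimal standalone sketch (mock functions, not the real metadb API) of why batch queries beat per-track queries: the dominant cost is per round trip to the database, not per value returned.

```javascript
// Mock query functions: each invocation simulates one metadb round trip.
let calls = 0;
const queryOne = (track) => {calls++; return track.toUpperCase();};
const queryMany = (tracks) => {calls++; return tracks.map((t) => t.toUpperCase());};

const tracks = ['a', 'b', 'c', 'd'];

// Per-track style: hammers the database with one round trip per track
const perTrack = tracks.map((t) => queryOne(t));
console.log(calls); // 4

// Batch style: one round trip for all tracks, same results
calls = 0;
const batched = queryMany(tracks);
console.log(calls); // 1
```

Same output, a quarter of the round trips; with 70K tracks the difference is what separates seconds from minutes.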
Microsoft Windows: We can't script here, this is bat country.

Re: Foobar 2.0 tag caching

Reply #3
Quote:
There you have it - it's still possible to deliver acceptable performance using the new batch info query methods that I provided, without caching all the info all the time. If I give up right away and cache everything like old foobar2000 did, nobody will use them.
Providing zero-cost access to all info was the design problem; it made everyone abuse it. Even my old code hammered metadb with multiple queries per processed track.
Before, foobar2000 startup was slow due to the huge library being loaded regardless of the user interface in use.
Now, if your user interface doesn't block waiting for data, startup is instant. And this is bad design according to you, right?
Not sure what you understood of my tests, but it's the opposite.

I had to cache the ENTIRE library's tags in JSON to get back acceptable performance. I'm not talking about the logical and sane coding practice of caching things which are going to be reused, but caching all tags of the entire library (or at least those which are going to be used).
And since tag caching is slow in JS, even with the new fb v2 methods, it's done once and that's it. When a path changes, the cache gets rebuilt.

Code:
'use strict';
//06/12/22

include('callbacks_xxx.js');
include('helpers_xxx.js');
include('helpers_xxx_file.js');
include('helpers_xxx_crc.js');
if (!isFoobarV2) {console.log('Tags Cache is being used on foobar <2.0. This is not recommended.');}

// Tags cache
// Tag retrieval is too slow when retrieving tags on Foobar 2.0+
const tagsCache = {
files: {}, // one per tag
folder: fb.ProfilePath + 'js_data\\tagsCache\\',
cache: new Map(), // [ID, [Value, ...], ...], where ID = handle.RawPath + handle.SubSong
filesCRC: {},
currCRC: {},
toStr: {},
updateFromhook: false,
enabled: false,
listeners: []
};

_createFolder(tagsCache.folder);
_save(tagsCache.folder + '_XXX-SCRIPTS_CACHE_FILES', null); // Add info files

tagsCache.cacheTags = function (tagNames, iSteps, iDelay, libItems = fb.GetLibraryItems().Convert(), bForce = false) {
if (!isArray(libItems) || !libItems.length) {return null;}
return new Promise((resolve) => {
let items = [];
// Filter only items not cached before
tagNames.forEach((tag) => {
if (bForce) {
if (!this.cache.has(tag)){this.cache.set(tag, new Map());}
items = libItems;
} else {
if (this.cache.has(tag)){
const tagCache = this.cache.get(tag);
libItems.forEach((item) => {
if (!tagCache.has(item.RawPath + ',' + item.SubSong)) {items.push(item);}
});
} else {
this.cache.set(tag, new Map());
items = libItems;
}
}
});
const count = items.length;
const total = Math.ceil(count / iSteps);
const promises = [];
const tf = tagNames.map((tag) => {return fb.TitleFormat(tag);});
if (!count) {promises.push('done');}
else {
let prevProgress = -1;
for (let i = 1; i <= total; i++) {
promises.push(new Promise((resolve) => {
setTimeout(() => {
tagNames.forEach((tag, j) => {
const tagCache = this.cache.get(tag);
if (tagCache && tagCache.size !== count) {
const iItems = new FbMetadbHandleList(items.slice((i - 1) * iSteps, i === total ? count : i * iSteps));
const newValues = tf[j].EvalWithMetadbs(iItems);
newValues.forEach((newVal, h) => {
const item = iItems[h];
tagCache.set(item.RawPath + ',' + item.SubSong, newVal.split(', '));
});
}
});
const progress = Math.round(i / total * 100);
if (progress % 10 === 0 && progress > prevProgress) {prevProgress = progress; console.log('Caching tags ' + progress + '%.');}
resolve('done'); // Always resolve, so Promise.all() settles even when nothing was cached
}, iDelay * i);
}));
}
}
Promise.all(promises).then(() => {
console.log('cacheTags: got ' + JSON.stringify(tagNames) + ' tags from ' + count + ' items.')
tagNames.forEach((tag) => {
this.updateCacheCRC(tag);
});
resolve(this.cache);
}, (error) => {throw new Error(error);});
});
}

tagsCache.getTags = function (tagNames, libItems, bFillWithTF = true) {
const tags = Object.fromEntries(tagNames.map((tag) => {return [tag, []];}))
const tf = bFillWithTF ? tagNames.map((tag) => {return fb.TitleFormat(tag);}) : null;
tagNames.forEach((tag, i) => {
if (tag.toLowerCase() === 'skip') {return;}
let bUpdate = false;
if (this.cache.has(tag)) {
const tagCache = this.cache.get(tag);
libItems.forEach((item) => {
const id = item.RawPath + ',' + item.SubSong;
const bCached = tagCache.has(id);
const newVal = bCached ? tagCache.get(id) : (bFillWithTF ? tf[i].EvalWithMetadb(item).split(', ') : null);
if (!bCached) {tagCache.set(id, newVal); bUpdate = true;}
tags[tag].push(newVal);
});
} else if (bFillWithTF) {
this.cache.set(tag, new Map());
const tagCache = this.cache.get(tag);
libItems.forEach((item) => {
const id = item.RawPath + ',' + item.SubSong;
const newVal = tf[i].EvalWithMetadb(item).split(', ');
tagCache.set(id, newVal);
tags[tag].push(newVal);
});
bUpdate = true;
}
if (bUpdate) {
this.updateCacheCRC(tag);
}
});
return tags;
}

tagsCache.clear = function (tagNames) {
tagNames.forEach((tag) => {
if (this.cache.has(tag)) {this.cache.delete(tag);}
this.cache.set(tag, new Map());
this.updateCacheCRC(tag);
});
}

tagsCache.deleteTags = function (tagNames, libItems) {
tagNames.forEach((tag) => {
if (this.cache.has(tag)){
const tagCache = this.cache.get(tag);
libItems.forEach((item) => {
const id = item.RawPath + ',' + item.SubSong;
if (tagCache.has(id)) {tagCache.delete(id);}
});
this.updateCacheCRC(tag);
}
});
}

tagsCache.load = function (folder = this.folder) {
this.enable();
if (!_isFolder(folder)) {return;}
const files = getFiles(folder, new Set(['.json']));
files.forEach((filePath) => {
new Promise((resolve) => {
const fileName = utils.SplitFilePath(filePath).slice(1).join();
const file = _open(filePath, utf8)
const obj = _jsonParse(file);
if (obj) {
const tag = obj.tag;
const entries = obj.entries;
if (!this.cache.has(tag)) {this.cache.set(tag, new Map(entries));}
else {
const tagCache = this.cache.get(tag);
entries.forEach((pair) => {tagCache.set(pair[0], pair[1]);});
}
}
this.filesCRC[fileName] = this.currCRC[fileName] = crc32(file);
this.toStr[fileName] = file;
this.files[fileName] = filePath;
resolve('done');
});
});
console.log('Tags Cache loaded.');
}

tagsCache.unload = function () {
[...this.cache.keys()].forEach((tag) => {
this.cache.get(tag).clear();
});
this.cache.clear();
this.files = this.filesCRC = this.currCRC = this.toStr = {};
this.enabled = false;
this.listeners.forEach((listener) => {removeEventListener(listener.event, null, listener.id);});
this.listeners = [];
}

tagsCache.enable = function () {
this.enabled = true;
this.listeners = [
// Auto-update cache
addEventListener('on_library_items_added', (handleList) => {
if (!this.enabled) {return;}
const keys = [...this.cache.keys()];
if (keys.length) {
this.cacheTags(keys, iStepsLibrary, iDelayLibrary, handleList.Convert(), true);
}
}),
addEventListener('on_metadb_changed', (handleList, fromHook) => {
if (!this.enabled) {return;}
if (!this.updateFromhook && fromHook) {return;}
const keys = [...this.cache.keys()];
if (keys.length) {
this.cacheTags(keys, iStepsLibrary, iDelayLibrary, handleList.Convert(), true);
}
}),
addEventListener('on_library_items_removed', (handleList) => {
if (!this.enabled) {return;}
const keys = [...this.cache.keys()];
if (keys.length) {
this.deleteTags(keys, handleList.Convert());
}
}),
addEventListener('on_script_unload', () => {
if (!this.enabled) {return;}
this.save();
})
];
}

tagsCache.disable = function () {
this.enabled = false;
}

tagsCache.save = function (folder = this.folder) {
for (let fileName in this.currCRC) {
if (this.filesCRC[fileName] !== this.currCRC[fileName]) {
if (!this.files.hasOwnProperty(fileName)) {this.files[fileName] = folder + fileName + '.json';}
_deleteFile(this.files[fileName], true);
_save(this.files[fileName], this.toStr[fileName]);
}
}
}

tagsCache.updateCacheCRC = function (tag) {
const key = _asciify(tag);
const tagCache = this.cache.get(tag);
this.toStr[key] = JSON.stringify({tag, entries: tagCache}, MapReplacer, '\t');
this.currCRC[key] = crc32(this.toStr[key]);
}

I get your point Peter, but you are not getting mine.

If you have a process which requires ALL genre tags in the library, there is no way caching on demand solves anything. The first call will take 10 seconds no matter what you do, because now it's not free. Sure, you can cache it for the second call.

But what about requiring ALL genre tags from just part of the library? You have to create a cache linked somehow to UUIDs per track. Or something more complex:
And to work on the first call, it must be done at init, once. That's what I did, OK, but then... this is just full-library tag caching, the thing you removed XD

I'm just saying all this should be added back built-in, for configurable tags. Let the user choose whether they want 10 more seconds of startup in exchange for free tag access afterwards or not.
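The trade-off described above can be sketched in plain JavaScript (hypothetical names; the real thing would read tags through the fb API): pay the full read cost once at init, then every lookup afterwards is free.

```javascript
// InitTagCache pays the full library read once at construction time.
// readAllTags() stands in for one slow batch read of the whole library.
class InitTagCache {
constructor(readAllTags) {
// Key by path + subsong, mirroring the id scheme used in the script above
this.byId = new Map(readAllTags().map((t) => [t.path + ',' + t.subSong, t.tags]));
}
get(id) { // O(1) after init; no per-call metadb query
return this.byId.get(id) || null;
}
}

// Tiny fake library to show usage
const fakeLibrary = () => [
{path: 'C:\\a.mp3', subSong: 0, tags: {GENRE: ['Rock']}},
{path: 'C:\\b.mp3', subSong: 0, tags: {GENRE: ['Jazz', 'Fusion']}}
];
const cache = new InitTagCache(fakeLibrary);
console.log(cache.get('C:\\a.mp3,0').GENRE); // ['Rock']
```

The 10-second hit lives entirely in the constructor; that's the startup cost the user would opt into.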

Re: Foobar 2.0 tag caching

Reply #4
Quote:
And this is essentially the same issue I was talking about here: the size of the database has multiplied and media search takes more time now, which is upsetting.
I'm a tag maniac, so most of my tracks are filled with extra info, including lyrics etc. Moreover, I tested fb2k v2 on a RAM drive; I can't imagine how long it would take to search on an HDD or SATA SSD with this ~400 MB library database.

One of the fun traits of the old search implementation is that it scales perfectly with extra CPU threads, so the whole thing runs stupidly fast on any PC made during the last decade.
foobar2000 v2.0 uses a library search index... which means single-threaded operation, at least during the index search phase. At least it won't hog all your CPU cores during app startup with lots of autoplaylists, like old versions did.
Microsoft Windows: We can't script here, this is bat country.

Re: Foobar 2.0 tag caching

Reply #5
Quote:
One of the fun traits of the old search implementation is that it scales perfectly with extra CPU threads, so the whole thing runs stupidly fast on any PC made during the last decade.
foobar2000 v2.0 uses a library search index... which means single-threaded operation, at least during the index search phase. At least it won't hog all your CPU cores during app startup with lots of autoplaylists, like old versions did.
Is that an improvement? XD It's 2022; we are getting more cores, not fewer. And we go single-threaded now?

The autoplaylist "problem" is your design decision, linked to always having all playlists loaded in the UI, which is a design decision of yours too. That, along with the absence of a Playlist Manager capable of anything (apart from listing).
https://raw.githubusercontent.com/regorxxx/Playlist-Manager-SMP/main/readmes/playlist_manager.pdf
https://github.com/regorxxx/Playlist-Manager-SMP

There are other ways to have an autoplaylist within foobar2000 that is only loaded on demand, thus not affecting startup. If it can be done in JS, it can easily be done in the main program. If you treat playlists as files, which can be loaded, then the problem is gone.

With all due respect Peter, it seems to me like you are justifying "hiding" a bad design decision with another one. And we are moving to slower methods with every iteration, using less RAM and fewer cores, which makes no sense.

Re: Foobar 2.0 tag caching

Reply #6
Quote:
Not sure what you understood of my tests, but it's the opposite.
OK, I get the idea what you did now.
Still, you can roughly benchmark how long it takes to read all library tags using the proper methods on your specific config by putting all your music in a playlist and then searching it (playlist search can't use the index). I bet it takes less than a minute.

I'm not dogmatically sticking to the new design and refusing to cache tags.
In fact, I hate this situation of users suffering from slow startup as much as you do.
But,
  • It IS possible to deliver acceptable full library search performance if you use batch queries. The component hosting your JS should be updated to publish this functionality.
  • If I step back on first sign of trouble, I'm not really giving the new design any chance.
Microsoft Windows: We can't script here, this is bat country.

Re: Foobar 2.0 tag caching

Reply #7
Also:
I'm not switching to single-threaded operation by my own choice but because that's how SQLite works. If you know how to do SQLite LIKE searches in multiple threads, please let me know. Otherwise please stop trolling about bad design decisions.
Microsoft Windows: We can't script here, this is bat country.

Re: Foobar 2.0 tag caching

Reply #8
From my experience working with SQLite, there is no way to execute a single query in a multithreaded way. Access to SQLite can be multithreaded, but it won't help here (unless you manually split the search query somehow into separate queries, which would make it possible to run them concurrently afterwards).

Additional indexes and/or changing the schema might improve query execution speed a bit, but it still won't be multithreaded.

Re: Foobar 2.0 tag caching

Reply #9
Playlist search = running multiple queries in parallel.
Library search = single threaded query with index.
Unfortunately, neither wins against old foobar2000 search with all data readily available in app memory and readable locklessly in any number of threads.
Microsoft Windows: We can't script here, this is bat country.

Re: Foobar 2.0 tag caching

Reply #10
Tbh, I feel that caching would be needed for components that require frequent library searches. Though I'm not sure what the best way to implement it would be, i.e. make caching the default (so that it would be invisible to devs and maybe could be reused by older components) or make some API/SDK that gives component developers a simple way to create a corresponding cache (or maybe even reuse some global cache).
IMO, the latter is the more suitable choice, because not all components need such caching (or maybe even none do for some setups).

[EDIT] PS: But I still want to try the v2 methods myself in different scenarios and benchmark them against v1.6, before trying to implement any sort of cache =)

Re: Foobar 2.0 tag caching

Reply #11
Quote:
I know JSP has been updated to improve the situation, but that's just a patch to a design problem.

I hadn't seen this before, but I guess it must be in reference to this post I made some time ago, where benchmarking the new methods against the old ones is possible. With each code snippet, the Print() function dumps the time taken to the fb2k console.

https://hydrogenaud.io/index.php/topic,120982.msg1016044.html#msg1016044

My updated method uses queryMultiParallel_ / formatTitle_v2 internally.

Since then, I've also updated my fb.GetLibraryItems method to take an optional query arg which uses the new library_index API internally. I asked about the benefits of this in the SDK thread and Peter gave me this answer...

https://hydrogenaud.io/index.php/topic,122761.msg1017949.html#msg1017949

I have no real way of comparing this against the old GetQueryItems code because my collection is so pathetically small. :))

Re: Foobar 2.0 tag caching

Reply #12
Quote:
I have no real way of comparing this against the old GetQueryItems code because my collection is so pathetically small. :))
Obviously problems only arise with big libraries (which is my case too), and having to wait 10+ seconds to perform some complex queries and tag retrieval is not a great user experience for foobar2000 v2 users on big libraries... especially when they have seen the same things execute on v1 in a few seconds. In that sense, even understanding all the points given, updating to v2 feels like a regression right now. And I can't really recommend anything but going back to v1 to the users who ask me about this via PM or email, reporting slow processing.

I agree with TheQwertiest about implementing a tag cache over the SQL database, for component usage or as an optional global preference in case a user just wants extra performance. On x64 there are no more RAM limits, so there is no need for hacks like cutting tags at some length and all that old "LargeFieldsConfig" stuff. Just cache all tags, and update an entry when its timestamp changes.
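A minimal sketch of that timestamp-based invalidation in plain JavaScript (getCachedTags and refreshTags are made-up names for illustration): an entry is re-read only when the file's modification time no longer matches the cached one.

```javascript
// Cache full tags per file; refresh an entry only when the timestamp changes.
// refreshTags(path) stands in for the (slow) real tag read from the file.
function getCachedTags(cache, path, mtime, refreshTags) {
const entry = cache.get(path);
if (entry && entry.mtime === mtime) {return entry.tags;} // hit: zero cost
const tags = refreshTags(path); // miss or stale: re-read tags from the file
cache.set(path, {mtime, tags});
return tags;
}

// Usage: a second call with the same mtime never touches the file again
const cache = new Map();
let reads = 0;
const slowRead = () => {reads++; return {ARTIST: ['Someone']};};
getCachedTags(cache, 'C:\\a.mp3', 1000, slowRead);
getCachedTags(cache, 'C:\\a.mp3', 1000, slowRead);
console.log(reads); // 1
getCachedTags(cache, 'C:\\a.mp3', 2000, slowRead); // timestamp changed: re-read
console.log(reads); // 2
```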

I can't compare the v2 methods anyway until SMP is updated, so that's out of my hands right now. I'm not going to port 10k lines of code just to test it. I can only report that tag caching via JSON files with SMP on v2 brings the performance back to v1 levels, which should apply equally to JSP.

But obviously that method, apart from being a patch, is totally suboptimal. Tracks must be identified by path + subsong, which then again must be retrieved from the database, and the timestamps are retrieved via an ActiveX object, adding a lot of overhead. Also, it can only cache full TF expressions (unless I also mimic single tag retrieval and identification within TF expressions), so %ARTIST% is a different entry than %ARTIST% - %TITLE%. Multiply that by every item in the library... Caching should be in the main program.
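The duplication described above can be shown with a tiny standalone sketch (hypothetical structures, not SMP's actual internals): keying the cache by full TF expression stores one value per expression per track, even when expressions share the same underlying tags.

```javascript
// Cache keyed by full title-format expression: expression -> Map(trackId -> value)
const tfCache = new Map();
function cacheTF(expression, trackId, value) {
if (!tfCache.has(expression)) {tfCache.set(expression, new Map());}
tfCache.get(expression).set(trackId, value);
}

const id = 'C:\\a.mp3,0'; // path + subsong, as in the script above
cacheTF('%ARTIST%', id, 'Someone');
cacheTF('%ARTIST% - %TITLE%', id, 'Someone - Song');

// Two separate entries for one track, even though both contain the artist
console.log(tfCache.size); // 2
```

Multiplied across every expression a script uses and every item in the library, the redundant storage adds up, which is the argument for caching single tags in the main program instead.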

Re: Foobar 2.0 tag caching

Reply #13
It is official:
After doing various benchmarks of new vs old foobar2000 performance, I have decided that I'm stepping back on the memory caching changes of foobar2000 v2.0, pretty much going back to v1.x-era caching.
The next beta - ETA next week - will behave a lot differently. It will also come with search that's actually much faster than any past implementation this time.
Benefits of SQLite remain - no more slow shutdown due to lots of saving; startup is still fast because info won't be loaded until something asks for it.
It's still good to use the queryMulti() methods; they're faster in case the info isn't cached yet, but they are no longer critical.
Microsoft Windows: We can't script here, this is bat country.

Re: Foobar 2.0 tag caching

Reply #14
Beta 18 out. Have fun with it.

The search is definitely NOT slower than old foobar2000 now.
Microsoft Windows: We can't script here, this is bat country.

Re: Foobar 2.0 tag caching

Reply #15
Great news :)
Haven't been able to test it on my main system, BUT I downloaded it and noticed there is no more LargeFieldsConfig file, so... I suppose every tag is read in full now? No limits on tag size? Is there some kind of configuration (like skipping the caching of a given tag)?

One of the problems I had trying to implement a plugin for ChromaPrint was that the raw fingerprint was too large for caching (like 500+ MB in RAM, plus another 500 MB when querying it... leading to memory failures in SMP). I just found a solution for v1.6 this week using ffmpeg's ffprobe on the physical files to extract the full tag, saving the results in multiple 100 MB JSON files which are then read on demand. Creating the database easily takes 1h for 70k files, but the search for matches takes just a few seconds with some tricks. If the tag can finally be read directly, that's really good news.
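The chunked-JSON part of that workaround can be sketched in plain JavaScript (chunkEntries is a made-up helper; byte sizes are approximate, counting serialized length rather than true on-disk bytes): split the dump into files of roughly maxBytes each, so no single file has to be held in memory when searching on demand.

```javascript
// Split a big array of tag entries into chunks whose serialized JSON size
// stays under roughly maxBytes, so each chunk can be saved to its own file.
function chunkEntries(entries, maxBytes) {
const chunks = [[]];
let size = 2; // account for the enclosing '[]'
for (const e of entries) {
const s = JSON.stringify(e).length + 1; // entry plus a separator comma
if (size + s > maxBytes && chunks[chunks.length - 1].length) {
chunks.push([]); // start a new chunk/file
size = 2;
}
chunks[chunks.length - 1].push(e);
size += s;
}
return chunks;
}

// Usage: 10 fingerprint-like entries with a tiny 120-byte budget
const entries = Array.from({length: 10}, (_, i) => ({path: 'track' + i, fp: 'x'.repeat(40)}));
const chunks = chunkEntries(entries, 120);
```

With the real ~100 MB budget, each chunk would be serialized and saved as its own JSON file, then loaded individually during search.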

Re: Foobar 2.0 tag caching

Reply #16
Everything LargeFieldsConfig.txt related is now gone. Full tags are always read.

The original problem that I attempted to solve with LargeFieldsConfig.txt was not just memory usage but also the extreme overhead of loading/saving large playlists & library state on app startup/shutdown respectively. This is no longer an issue: loading has been made asynchronous, and saving happens while running and no longer rewrites huge amounts of data.
Microsoft Windows: We can't script here, this is bat country.

Re: Foobar 2.0 tag caching

Reply #17
I modified "LargeFieldsConfig.txt" a long time ago, when the lyrics for several songs didn't show, among other annoyances of mine. Last modified: "June 30, 2016". It might have been even earlier when it first changed, as the file only records when it was last modified. I back up my configuration and the components I'm using, so that modified file originated from an older computer as well. After about 6 years, none of us need to worry anymore about "LargeFieldsConfig.txt" being the random reason something isn't showing up, not that I have needed to change it in the 6 years since. Progress!!

Thanks @Peter :)