FasTC

mirror of https://github.com/yuzu-emu/FasTC synced 2024-11-23 02:43:40 +00:00

Author	SHA1	Message	Date
Pavel Krajcevski	663caada50	Generalize BPTC compression. 1. Split compression parameter generation and compression parameter packing. This gives a good performance boost, since we don't pack every single time we compress. The error is computed each time, and only the best parameters are packed. 2. Allow the shape selection function to specify up to ten shapes to try for compression. We were already doing this kind of hackily where we allowed both a three and two partition shape. This makes it a little cleaner and exposes it to the user.	2014-03-25 11:40:06 -04:00
Pavel Krajcevski	d03732fc09	Move BPTC shapes header to include folder	2014-03-22 21:17:46 -04:00
Pavel Krajcevski	220a736a36	Move the other BPTC settings into the settings struct	2014-03-22 19:52:58 -04:00
Pavel Krajcevski	9144db4de6	Actually pass block coordinates to shape selection function	2014-03-22 19:25:21 -04:00
Pavel Krajcevski	cf937f2ad3	Refactor shape and mode selection. We suffered another performance hit. This time it comes from the fact that we're copying around a lot of data based on what partition we're choosing. We can get rid of this a tad by only copying the data that we need once and then using getters/setters that selectively pull from an array based on our shape index.	2014-03-21 18:02:02 -04:00
Pavel Krajcevski	26e816b3db	Add settings for BPTC compression	2014-03-21 12:45:47 -04:00
Pavel Krajcevski	f12ee09f7e	Some formatting and rearrange the BPTC code to be more structured like the others	2014-01-21 14:46:25 -05:00
Pavel Krajcevski	6794a0fffb	Add hooks to NVTT bc7_export library if present on the user's machine. Assumes that all of the cross platform problems are fixed for incorporation into FasTC... Otherwise the options to use NVTT are ignored.	2013-11-19 12:03:03 -05:00
Pavel Krajcevski	a80944901e	Refactor CompressionJob struct. In order to better facilitate the change from block stream order to non-block stream order, a lot of changes were introduced to the way that we feed texture data to the compressors. This data is embodied in the CompressionJob struct. We have made it so that the compression job points to both the in and out pointers for our compressed and uncompressed data. Furthermore, we have made sure that the struct also contains the format that it's compressing for, so that if any threading programs would like to chop up a compression job into smaller chunks based on the format, it doesn't need to know the format explicitly, it just needs to know certain properties about the format. Moreover, the user can now define the start and end pixels from which we would like to compress to. We can compress subsets of data by changing the in and out pointers and the width and height values. The compressors will read data linearly until they reach the out pixels based on the width of the given pixel.	2013-11-08 16:31:19 -05:00
Pavel Krajcevski	28cf254fe5	Initial decoupling of base library from core library. Includes a few formatting changes as well.	2013-09-13 19:36:37 -04:00
Pavel Krajcevski	0304bd4187	Refactor a bunch of things to reinforce a bunch of style rules.	2013-08-26 16:11:39 -04:00
Pavel Krajcevski	f1f1294b2e	Add tab formatting.	2013-08-22 18:33:42 -04:00
Pavel Krajcevski	ae2324153d	Repurpose the rest of our scaffolding to use Compression Jobs	2013-03-09 13:36:39 -05:00
Pavel Krajcevski	435f935de3	Update atomics compression algorithm. In general, we want to use this algorithm only with self-contained compression lists. As such, we've added all of the proper synchronization primitives in the list object itself. That way, different threads that are working on the same list will be able to communicate. Ideally, this should eliminate the number of user-space context switches that happen. Whether or not this is faster than the other synchronization algorithms that we've tried remains to be seen...	2013-03-09 13:34:10 -05:00
Pavel Krajcevski	53fe825e49	Add first pass of atomic implementation. This is a first pass of what I believe to be a not too terrible implementation of a cooperative thread-based compressor. The idea is simple... If a compressor is invoked with the same parameters on multiple threads, then the threads cooperate via an atomic counter to compress the texture. Each thread can take as long as possible until the texture is finished. If a caller calls a compression routine that has different parameters, then it will help the current compression finish before starting on its own compression. In this way, we can split the textures up among the threads and guarantee that we maximize the resource usage between them. I.e. this becomes more efficient: Thread 1: tex0, tex1, .., texN-1; Thread 2: texN, texN+1, .., tex2N; Thread N: tex(N-1)N, tex(N-1)(N+1), .., tex(N-1)N. I have not tested this for bugs, so I'm still not completely convinced that it is deadlock-free although it should be...	2013-03-06 18:47:15 -05:00
Pavel Krajcevski	8cad373e8e	Small refactoring changes.	2013-02-05 21:54:06 -05:00
Pavel Krajcevski	5eba3ba6f7	Add license	2012-11-15 11:51:55 -05:00
Pavel Krajcevski	1e6a2d4c7b	Add new compression function that collects preliminary stats.	2012-10-31 17:48:52 -04:00
Pavel Krajcevski	341842d725	Make sure to not even compile the definition for the SIMD function.	2012-09-13 17:43:58 -04:00
Pavel Krajcevski	87375f4c14	Change signed to unsigned in order to match the function pointer typedef prototype. Changed the function prototype to match that of the typedef in the rest of the library, and fixed a bug where we would iterate too far with the initial buffer.	2012-08-28 19:40:00 -04:00
Pavel Krajcevski	efdca4b5bb	Initial commit with a few modifications	2012-08-24 15:56:45 -04:00

21 commits
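
The cooperative scheme described in commit 53fe825e49 (threads invoked with the same parameters share the work through an atomic counter) can be illustrated with a minimal sketch. This is not FasTC's code: `CompressBlock`, `CooperativeCompress`, and `RunJob` are hypothetical names, a "block" is reduced to a single byte, and the part of the commit where a caller with different parameters first helps finish the current job is left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a real per-block compressor (assumption, not FasTC's API).
static std::uint8_t CompressBlock(std::uint8_t in) {
  return static_cast<std::uint8_t>(in ^ 0xFF);
}

// Every cooperating thread calls this with the same arguments. Each call
// claims the next unprocessed block index from the shared counter and
// returns once all blocks have been claimed.
static void CooperativeCompress(const std::vector<std::uint8_t> &in,
                                std::vector<std::uint8_t> &out,
                                std::atomic<std::size_t> &nextBlock) {
  assert(out.size() == in.size());
  for (;;) {
    // fetch_add hands each index to exactly one thread.
    const std::size_t i = nextBlock.fetch_add(1, std::memory_order_relaxed);
    if (i >= in.size()) {
      return;
    }
    out[i] = CompressBlock(in[i]);
  }
}

// Runs one job on numThreads threads and returns the compressed blocks.
// Joining the threads makes all of their writes visible to the caller.
static std::vector<std::uint8_t> RunJob(const std::vector<std::uint8_t> &in,
                                        unsigned numThreads) {
  std::vector<std::uint8_t> out(in.size());
  std::atomic<std::size_t> nextBlock(0);
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < numThreads; ++t) {
    workers.emplace_back(CooperativeCompress, std::cref(in), std::ref(out),
                         std::ref(nextBlock));
  }
  for (auto &w : workers) {
    w.join();
  }
  return out;
}
```

Calling `RunJob(blocks, 4)` would split the blocks dynamically across four threads. Because each thread pulls the next index only when it finishes its current block, a slow block does not hold up the others, which is the load-balancing property the commit message is after.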