Commit graph

21 commits

Author SHA1 Message Date
Pavel Krajcevski
663caada50 Generalize BPTC compression.
1. Split compression parameter generation and compression parameter
packing. This gives a good performance boost, since we don't pack every
single time we compress. The error is computed each time, and only the
best parameters are packed.

2. Allow the shape selection function to specify up to ten shapes to
try for compression. We were already doing this in an ad hoc way by
allowing both a two-partition and a three-partition shape. This makes it
a little cleaner and exposes it to the user.
2014-03-25 11:40:06 -04:00
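The split described in 663caada50 can be sketched roughly as follows. This is a toy illustration, not the FasTC code: the block is a single-channel 4x4 tile, each shape is a 16-bit mask with two subsets, and the error metric is a squared distance to the subset mean. All names (`Params`, `GenerateParams`, `PackParams`, `CompressBlock`, `kMaxShapes`, `g_packCount`) are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Block = std::array<uint8_t, 16>;  // toy 4x4 single-channel block

// Hypothetical parameters for one candidate shape (not FasTC's real layout).
struct Params {
  uint16_t shape = 0;        // bit i set => pixel i belongs to subset 1
  uint8_t mean[2] = {0, 0};  // one representative value per subset
  uint32_t error = 0;        // sum of squared differences
};

const std::size_t kMaxShapes = 10;  // "up to ten shapes", per the commit message
int g_packCount = 0;                // instrumentation only: counts packing calls

// Step 1: generate parameters and compute their error. No bit packing here.
Params GenerateParams(const Block &px, uint16_t shape) {
  Params p;
  p.shape = shape;
  uint32_t sum[2] = {0, 0}, count[2] = {0, 0};
  for (int i = 0; i < 16; ++i) {
    const int s = (shape >> i) & 1;
    sum[s] += px[i];
    ++count[s];
  }
  for (int s = 0; s < 2; ++s) {
    if (count[s] > 0) p.mean[s] = static_cast<uint8_t>(sum[s] / count[s]);
  }
  for (int i = 0; i < 16; ++i) {
    const int s = (shape >> i) & 1;
    const int d = static_cast<int>(px[i]) - static_cast<int>(p.mean[s]);
    p.error += static_cast<uint32_t>(d * d);
  }
  return p;
}

// Step 2: pack parameters into the output word. Called once per block.
uint32_t PackParams(const Params &p) {
  ++g_packCount;
  return (static_cast<uint32_t>(p.shape) << 16) |
         (static_cast<uint32_t>(p.mean[0]) << 8) |
         static_cast<uint32_t>(p.mean[1]);
}

// Evaluate every candidate shape, keep the lowest error, pack only the winner.
uint32_t CompressBlock(const Block &px, const std::vector<uint16_t> &shapes) {
  Params best = GenerateParams(px, shapes.empty() ? uint16_t(0) : shapes[0]);
  const std::size_t n = std::min(shapes.size(), kMaxShapes);
  for (std::size_t i = 1; i < n; ++i) {
    const Params cand = GenerateParams(px, shapes[i]);
    if (cand.error < best.error) best = cand;
  }
  return PackParams(best);
}
```

The error is still computed for every candidate, but `PackParams` runs once per block regardless of how many shapes are tried.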
Pavel Krajcevski
d03732fc09 Move BPTC shapes header to include folder 2014-03-22 21:17:46 -04:00
Pavel Krajcevski
220a736a36 Move the other BPTC settings into the settings struct 2014-03-22 19:52:58 -04:00
Pavel Krajcevski
9144db4de6 Actually pass block coordinates to shape selection function 2014-03-22 19:25:21 -04:00
Pavel Krajcevski
cf937f2ad3 Refactor shape and mode selection
We suffered another performance hit. This time it comes from the fact
that we're copying around a lot of data based on which partition we're
choosing. We can reduce this somewhat by copying the data that we need
only once and then using getters/setters that selectively pull from
an array based on our shape index.
2014-03-21 18:02:02 -04:00
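One way to picture the getter approach from cf937f2ad3 is a view that keeps the block's pixels in one place and resolves subset membership through the shape on access. This is a sketch under assumptions: a 16-bit two-subset shape mask and a single-channel block. `SubsetView` and its methods are hypothetical names, not the FasTC API.

```cpp
#include <array>
#include <cstdint>

using Block = std::array<uint8_t, 16>;  // toy 4x4 single-channel block

// Hypothetical view of one subset of a block. It stores only a small index
// table; the pixel data itself is never copied per partition.
class SubsetView {
 public:
  // The view keeps a pointer to `pixels`, so the block must outlive the view.
  SubsetView(const Block &pixels, uint16_t shape, int subset)
      : pixels_(&pixels) {
    for (int i = 0; i < 16; ++i) {
      if (((shape >> i) & 1) == subset) {
        indices_[count_++] = static_cast<uint8_t>(i);
      }
    }
  }

  int Size() const { return count_; }

  // Getter: the k-th pixel of this subset, read straight from the block.
  uint8_t Get(int k) const { return (*pixels_)[indices_[k]]; }

 private:
  const Block *pixels_;
  std::array<uint8_t, 16> indices_{};
  int count_ = 0;
};
```

Trying a different shape then costs one pass to rebuild the index table instead of a copy of the pixel data.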
Pavel Krajcevski
26e816b3db Add settings for BPTC compression 2014-03-21 12:45:47 -04:00
Pavel Krajcevski
f12ee09f7e Apply some formatting and rearrange the BPTC code to be structured more like the others 2014-01-21 14:46:25 -05:00
Pavel Krajcevski
6794a0fffb Add hooks to the NVTT bc7_export library if present on the user's machine. Assumes that all of the cross-platform problems are fixed for incorporation into FasTC. Otherwise, the options to use NVTT are ignored. 2013-11-19 12:03:03 -05:00
Pavel Krajcevski
a80944901e Refactor CompressionJob struct.
In order to better facilitate the change from block stream order to non-block stream order,
a lot of changes were introduced to the way that we feed texture data to the compressors. This
data is embodied in the CompressionJob struct. The compression job now points to both the in
and out pointers for our compressed and uncompressed data. Furthermore, the struct also contains
the format that it's compressing for, so that if any threading programs would like to chop up a
compression job into smaller chunks based on the format, they don't need to know the format
explicitly; they just need to know certain properties about the format.

Moreover, the user can now define the start and end pixels of the region that we would like to
compress. We can compress subsets of data by changing the in and out pointers and the width and
height values. The compressors will read data linearly until they reach the end pixel, based on
the given width.
2013-11-08 16:31:19 -05:00
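A rough sketch of how a job struct of this kind lets a threading layer chop work into chunks using only format properties. Everything here is an assumption for illustration: `JobSketch`, `Format`, `BlockBytes`, and `SplitByBlockRows` are hypothetical names, the input is taken to be linear 32-bit RGBA, and dimensions are assumed to be multiples of the 4x4 block size. The block sizes themselves (16 bytes for BPTC, 8 bytes for DXT1) are properties of those formats.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Format { BPTC, DXT1 };  // hypothetical subset of formats

// Hypothetical stand-in for the job description in the commit message.
struct JobSketch {
  Format format;
  const uint8_t *in;  // uncompressed pixels, linear row-major order
  uint8_t *out;       // compressed blocks
  uint32_t width;     // in pixels, assumed multiple of 4
  uint32_t height;    // in pixels, assumed multiple of 4
};

const uint32_t kBlockDim = 4;       // both formats use 4x4 blocks
const uint32_t kBytesPerPixel = 4;  // assumption: RGBA8 input

// A format property the splitter can query without special-casing formats.
constexpr uint32_t BlockBytes(Format f) {
  return f == Format::DXT1 ? 8u : 16u;
}

// Split a job into chunks of at most `blockRowsPerChunk` rows of blocks by
// adjusting only the in/out pointers and the height.
std::vector<JobSketch> SplitByBlockRows(const JobSketch &job,
                                        uint32_t blockRowsPerChunk) {
  std::vector<JobSketch> chunks;
  if (blockRowsPerChunk == 0) return chunks;
  const uint32_t blockRows = job.height / kBlockDim;
  const uint32_t blocksPerRow = job.width / kBlockDim;
  for (uint32_t r = 0; r < blockRows; r += blockRowsPerChunk) {
    const uint32_t rows = std::min(blockRowsPerChunk, blockRows - r);
    JobSketch c = job;
    c.in = job.in + static_cast<std::size_t>(r) * kBlockDim * job.width *
                        kBytesPerPixel;
    c.out = job.out + static_cast<std::size_t>(r) * blocksPerRow *
                          BlockBytes(job.format);
    c.height = rows * kBlockDim;
    chunks.push_back(c);
  }
  return chunks;
}
```

Each chunk is itself a complete job, so a worker thread can hand it to the same compressor entry point as the original.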
Pavel Krajcevski
28cf254fe5 Initial decoupling of base library from core library. Includes a few formatting changes as well. 2013-09-13 19:36:37 -04:00
Pavel Krajcevski
0304bd4187 Refactor a bunch of things to reinforce a bunch of style rules. 2013-08-26 16:11:39 -04:00
Pavel Krajcevski
f1f1294b2e Add tab formatting. 2013-08-22 18:33:42 -04:00
Pavel Krajcevski
ae2324153d Repurpose the rest of our scaffolding to use Compression Jobs 2013-03-09 13:36:39 -05:00
Pavel Krajcevski
435f935de3 Update atomics compression algorithm
In general, we want to use this algorithm only with self-contained compression
lists. As such, we've added all of the proper synchronization primitives in
the list object itself. That way, different threads that are working on the
same list will be able to communicate. Ideally, this should reduce the
number of user-space context switches that happen. Whether or not this is
faster than the other synchronization algorithms that we've tried remains
to be seen...
2013-03-09 13:34:10 -05:00
Pavel Krajcevski
53fe825e49 Add first pass of atomic implementation.
This is a first pass of what I believe to be a not too terrible
implementation of a cooperative thread-based compressor. The idea is
simple... If a compressor is invoked with the same parameters on multiple
threads, then the threads cooperate via an atomic counter to compress the
texture. Each thread can take as long as possible until the texture is finished.

If a caller calls a compression routine that has different parameters, then
it will help the current compression finish before starting on its own compression. In this
way, we can split the textures up among the threads and guarantee that we maximize the
resource usage between them. I.e. this becomes more efficient:

Thread 1:    Thread 2:   Thread N:
  tex0         texN        tex(N-1)N
  tex1         texN+1      tex(N-1)N+1
  ..           ..          ..
  texN-1       tex2N-1     texN*N-1

I have not tested this for bugs, so I'm still not completely convinced that it is deadlock-free
although it should be...
2013-03-06 18:47:15 -05:00
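The cooperative idea in 53fe825e49 can be sketched with a shared atomic counter from which every participating thread claims the next unprocessed block. This is a minimal illustration under assumptions, not the FasTC implementation: `SharedJob`, `ProcessBlock`, `Cooperate`, and `RunCooperatively` are hypothetical names, and "compressing" a block is replaced by a trivial byte transform.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical shared job: one input byte stands in for one block.
struct SharedJob {
  std::vector<uint8_t> in;
  std::vector<uint8_t> out;
  std::atomic<std::size_t> next{0};  // index of the next unclaimed block

  explicit SharedJob(std::vector<uint8_t> data)
      : in(std::move(data)), out(in.size()) {}
};

// Stand-in for compressing one block.
void ProcessBlock(SharedJob &job, std::size_t i) {
  job.out[i] = static_cast<uint8_t>(job.in[i] ^ 0xFF);
}

// Any thread that calls this helps finish the job. fetch_add hands out each
// index exactly once, so no two threads ever write the same output slot.
// Returns how many blocks the calling thread processed.
std::size_t Cooperate(SharedJob &job) {
  std::size_t done = 0;
  for (;;) {
    const std::size_t i = job.next.fetch_add(1, std::memory_order_relaxed);
    if (i >= job.in.size()) break;
    ProcessBlock(job, i);
    ++done;
  }
  return done;
}

// Launch several threads on the same job and wait for all of them.
void RunCooperatively(SharedJob &job, unsigned numThreads) {
  std::vector<std::thread> threads;
  for (unsigned t = 0; t < numThreads; ++t) {
    threads.emplace_back([&job] { Cooperate(job); });
  }
  for (std::thread &th : threads) th.join();
}
```

A thread arriving with a different job could call `Cooperate` on the current one first and then start its own, which is the helping behavior the message describes. This sketch uses no locks, so there is nothing to deadlock on, but it also omits the completion signalling that a real multi-job list would need.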
Pavel Krajcevski
8cad373e8e Small refactoring changes. 2013-02-05 21:54:06 -05:00
Pavel Krajcevski
5eba3ba6f7 Add license 2012-11-15 11:51:55 -05:00
Pavel Krajcevski
1e6a2d4c7b Add new compression function that collects preliminary stats. 2012-10-31 17:48:52 -04:00
Pavel Krajcevski
341842d725 Make sure to not even compile the definition for the SIMD function. 2012-09-13 17:43:58 -04:00
Pavel Krajcevski
87375f4c14 Change signed to unsigned in order to match the function pointer typedef prototype.
Changed the function prototype to match that of the typedef in the rest of the library, and fixed a bug where we would iterate too far with the initial buffer.
2012-08-28 19:40:00 -04:00
Pavel Krajcevski
efdca4b5bb Initial commit with a few modifications 2012-08-24 15:56:45 -04:00