|
ForumsSega Master System / Mark III / Game GearSG-1000 / SC-3000 / SF-7000 / OMV |
Home - Forums - Games - Scans - Maps - Cheats - Credits Music - Videos - Development - Hacks - Translations - Homebrew |
Author | Message |
---|---|
|
Snake-like motion of a chain of sprites
Posted: Fri Jan 20, 2023 5:18 pm
|
How would you go about if you wanted to move a string of sprites in a snake-like fashion and you neither want to use a huge amount of LUTs or too costly calculations?
I’m not talking about a simple sine pattern, I’m thinking more along the lines of the octopus's tentacle in Alex Kidd or the tail of the Dobkeratops in R-Type, so one end of the string is fixed and the other end swings around. Any ideas? |
|
|
Posted: Fri Jan 20, 2023 5:30 pm |
Do you have a little draw to clearly understand your idea?
PS. Ask to Calindro for AK octopuss, he may know how its animated. After a little reflexion, maybe just one LUT, each elem upd pos x/y (animated each 4, 6 or 8 frames, depends of the effect). UPD x/y pos can maybe be just 1 byte, as things will move "slowly" between each frame. Half byte for X and Y. Like %-000 for each half pair 4th bit is add(1) or sub(0), 3 bits for value (max =7px) + hope it fits in a 256 bytes space to make this simple (lol) |
|
|
Posted: Fri Jan 20, 2023 6:29 pm |
I'd be interested to see how the R-Type level 1 boss' tail differs from the giant worm thing in level 2, since the worm has to travel much further distances. I'd also have to check to see how smooth its turns are or if they're a bit chunky... | |
|
Posted: Fri Jan 20, 2023 9:22 pm |
Alex Kidd's octopus indeed uses a sine LUT for its tentacle positions. | |
|
Posted: Fri Jan 20, 2023 10:04 pm |
What exactly do you mean by "huge amount" for byte size in above? |
|
|
Posted: Sat Jan 21, 2023 12:20 am |
A few ideas from the top of my head: - Summing multiple sine waves with different frequencies and scaling factors can result in some pretty complex patterns; - Implementing some sort of "attraction" between the sprites could result in some interesting patterns, where the second sprite accelerates toward the first sprite and stops when it is close enough; same for third sprite in relation to the second, and so on; - A ring buffer on RAM can also be used for the movement patterns, where the second sprite is a few steps behind the first on the buffer, the third is a few steps behind the second, and so on. Of course, all of the above will require some amount of experimentation to get interesting results. |
|
|
Posted: Sat Jan 21, 2023 1:14 am |
"- Summing multiple sine waves with different frequencies and scaling factors can result in some pretty complex patterns"
Haroldloop, any scaled 8 bit function can be represented to arbitrary precision by a sum of sines (and hence cosines) where the frequencies can be chosen . So it is not surprising what you say - it's kinda of a well known math theorem. |
|
|
Posted: Sat Jan 21, 2023 8:13 am |
Well, the main difference is that the tail of the Dobkeratops is fixed on one end, while the Outslay accompanying the Gomander has its whole body move freely. I assume the latter's movement is basically just the head moving around and the other segments following with a little delay.
I couldn’t really put it into terms of byte size, but I was thinking along the lines of ichigobankai's suggestion: Pre-build a waggling animation manually, then read the x/y position of each segment for each frame from a LUT. I didn’t think of the clever idea of using just one byte for both x and y, but even then the resulting LUT won’t be small if you want to make the animation somewhat smooth. The other downside of this approach is that in the end it’s just a fixed animation, not the result of overlapping wave motions, which I imagine could look much more interesting. But maybe I’m overthinking things? Haroldo's approach is very close to what I’m aiming at. The question is how to achieve it efficiently both in terms of space and cpu time. I’m curious to hear more about how the octopus in Alex Kidd uses its LUT. If it’s not a table of pre-calculated positions but in fact a sine table, there must be some calculations translating the values into positions for the tentacle's segments. |
|
|
Posted: Sat Jan 21, 2023 8:51 am |
If you spend 256 bytes on an aligned table for a quarter of a sine wave then you can shift down for 128, 64, etc amplitude. You can also make the table smaller if you don’t need it. Or you might consider that ROM is more flexible than CPU per frame and have pre-baked animation LUTs with raw x, y values or even deltas. | |
|
Posted: Sat Jan 21, 2023 5:40 pm |
I did a very small demo some time ago with some kind of similar movement:
|
|
|
Posted: Sun Jan 22, 2023 12:04 am |
Though an sine wave can accurately minimally be described essentially as a 4 times repeating pattern with a rotation you can probably save some more memory by using a linear or small angle approximation for the part that looks more like a straight line but that would depend on the specifics of what Kagesan want exactly. "How would you go about if you wanted to move a string of sprites in a snake-like fashion and you neither want to use a huge amount of LUTs or too costly calculations? " the question is a bit too generic for me to decide if a using a small angle or linear approximation is suitable to save more on the 256 bytes (as Maxim suggests). " Haroldo's approach is very close to what I’m aiming at. The question is how to achieve it efficiently both in terms of space and cpu time. " Maybe someone wants to attempt to code for the desired "snake like motion" to maybe find out if a small angle or linear approximation helps for space or CPU time. |
|
|
Posted: Sun Jan 22, 2023 9:39 am |
Thank you, very interesting. I'll have to study that a bit.
Yeah, sorry. I'm not quite sure yet where this idea is going to lead eventually, so I'm afraid I can't be much more specific at this point. I guess the best way of implementation can only really be decided if a final form is settled on. Basically, I want to do something resembling this, but have the thing be composed of sprites like in kusfo's demo: https://www.youtube.com/watch?v=SEt9-YtG0Ro Nevertheless, this thread has already given me some good approaches to think about, and I'm open to more or better ones if someone can think of any. Thank you all. Do keep them coming if you have more suggestions. |
|
|
Posted: Sun Jan 22, 2023 10:42 am |
I would recommend first trying to do some experimentation without bothering too much about early optimization. Even it is proven to be too slow or memory intensive at the first few prototypes, it could give you a better idea of what looks good and at what cost, and could be a basis as to what direction to follow.
Please, do keep us updated. |
|
|
Posted: Sun Jan 22, 2023 10:51 am |
Also, as for the sum-of-waves idea:
- It the sums of senoids is shown to be too slow, one alternative would be to trade time for space, by having a few more LUTs containing precomputed complex waves, them sum those, instead of starting from the senoids; - One way to reduce repetition of the waves would be to use different frequencies for each, so the waves would "slide" over each other; - Avoid using integer multiples for the frequencies as it would put the waves in-sync, which would make them too predictable, unless that is the effect that you're targeting. |
|
|
Posted: Sun Jan 22, 2023 6:04 pm |
I don't know assembly... But a bezier curve requires fairly simple math to produce. Perhaps that would be an option? | |
|
Posted: Sun Jan 22, 2023 6:13 pm |
The required mathematical operations could be done on a Z80, but since it requires many multiplications for each point, they seem unlikely to be doable in real time on this processor, unless they're precomputed in some way... |
|
|
Posted: Mon Jan 23, 2023 3:03 am |
The Z80 not having multiply built in the silicon doesn't help but interestingly there are 8 bit CPUs that do have a hardware multiply instruction. |
|
|
Posted: Mon Jan 23, 2023 9:55 am |
I remembered that I also found some interesting source code in Python for an R-Type tail here:
https://github.com/Wireframe-Magazine/Wireframe-6/blob/master/source-code/tail.py |
|
|
Posted: Mon Jan 23, 2023 10:32 am |
if each sprite moves in fact in the same way as the others, just on a different frame, you might have a simple single delta list for the position and add the appropriate one according to each sprite own index into that table but I suspect it would be much nicer if you build a chain of sprites where each one has its own speed vector that is affected by the previous sprite position - which would be likely also more demanding on the CPU edit: this tweet could also be helpful! |
|
|
Posted: Mon Jan 23, 2023 6:13 pm |
Awesome! That might be exactly what I’m looking for. Thank you. @sverx: Thank you for the link to the thread, that is indeed interesting. Note that the Dobkeratops tail in the first post works backwards. It looks like the tip of the tail has the lead and drags the other segments with it, but in the original, if I recall correctly, the motion originates from the fixed part of the tail and it settings the rest around. |
|
|
Posted: Mon Jan 30, 2023 4:04 pm |
So, I tried to translate the python code kusfo linked to into asm and ...it actually works quite well. It's fun to experiment with the values and see how they influence the movement.
It's not *quite* right, though. It currently looks like a snake whose tail is nailed to the ground and who tries to wiggle free. It should look more like a vine swinging around, with the fixed end being the origin of the movement. I think this isn't my fault, it's in the example algorithm that works this way. Unfortunately, it's super slow. I haven't put much care into optimization yet. I've attached the source, just in case someone wants to take a look. |
|
|
Posted: Mon Jan 30, 2023 4:06 pm |
Thanks for providing source, this has been a interesting topic to read. |
|
|
Posted: Tue Jan 31, 2023 3:31 pm |
I took a screen capture of the example video linked from Twitter above, and reversed the playback. That gives you the effect you're looking for, with the movement originating from the fixed end. So I think in whatever way the hierarchy is structured in the algorithm, it needs to be reversed. |
|
|
Posted: Tue Jan 31, 2023 6:36 pm |
Thank you, that’s really interesting. I’ll keep that in mind when I get back to this for optimizations. |
|
|
Posted: Sat Feb 04, 2023 12:56 pm |
I think I got it now.
I re-wrote the sine/cosine calculation. It uses a 64 byte LUT and is fairly fast. I also changed the multiplication from a straightforward repeated addition to something more sophisticated. Thanks to Centrale's valuable hint I reversed the order in which the phase steps are applied to the individual segments and finally got the effect I was looking for. All this reduced the calculation of the whole tail to roughly a third of the original time, and it clocks in at an average of around 40 scanlines. While that's not exactly blazingly fast, it's not too bad either. More optimizations could easily be achieved: - Use a 256 byte LUT for sine/cosine calculation. - If your wobble amount value is a power of two, replace the multiplication with a bunch of add hl, hl. This alone shaves off around 10 scanlines. Source is included. Have fun experimenting with the various values. I've attached examples for both the classic Dobkeratops tail and a swinging vine. Unfortunately, the GIFs don't really do the whole thing justice. In reality, the movement is super-smooth. P.S.: Can someone with a Twitter account please tell the guy who is working on the R-Type conversion for the Mega Drive, that his Dobby's tail moves backwards? |
|
|
Posted: Sat Feb 04, 2023 5:43 pm |
It's beautiful! Nice job! :D | |
|
Posted: Sat Feb 04, 2023 11:26 pm |
Nice, it looks great.
I just did that for you! |
|
|
Posted: Sun Feb 05, 2023 9:22 am |
Thank you! | |
|
Posted: Sun Feb 05, 2023 1:57 pm |
A possible, alternative *faster* approach, if your wobble amount keeps constant all the time, is to have a 256-entries LUT of 8.8 fixed point values of k*sin(x) so you don't have to perform any multiplication at all. Yes, this requires 512 bytes, but if that's too much you can just store one quarter of the wave (it would be 128 byte). I also prefer to store 65 values in my quarter-LUTs so to avoid checking for special cases in the code. Yes, first value will always be 0 but it's a small waste of ROM space and it shaves off a few CPU cycles at every call. |
|
|
Posted: Sun Feb 05, 2023 3:29 pm |
That’s a good idea, but each segment needs two sin(x) and one cos(x) calculation to determine the position of the sprite. If I bake the multiplication into one of them, I need to do the same for the other one, so I need two huge LUTs. Honestly, I think I prefer versatility over speed in this case.
I’m not sure I understand this correctly. Wouldn’t you need at least a 130 byte LUT for that? The msb for each entry would always be zero except for the two special cases at 90 and 270 degrees. It’s true that it would save an average of around 3 scanlines for the whole chain (34 cycles per call) though, so if speed is of the essence, it might be worth it. |
|
|
Posted: Sun Feb 05, 2023 4:04 pm |
the k*sin(x) LUT can be used also for the k*cos(x) LUT the same way you already do for the sin(x) LUT, it's all about shifting the angle 90°... As for the 65 (8.8 fixed point) entries LUTs, yes they require 130 bytes each. They're k*sin(x) quarter-LUTs. |
|
|
Posted: Sun Feb 05, 2023 4:23 pm |
Sure, but k can still be two different things, wobble amount and pixel width of the segment, which would require one pre-baked LUT each. |
|
|
Posted: Mon Feb 06, 2023 8:36 am |
Oh, I thought it would remain constant, so that's why I suggested baking it into the LUT to avoid the (slow) multiplication. BTW in the end it depends on your situation. You might find that you can't afford so many CPU cycles and that some ROM would be empty anyway... ;) |
|
|
Posted: Mon Feb 06, 2023 10:56 am |
I agree that now that we have a general working solution, further optimizations depend much on the context it's used in.
Just for fun, I applied your idea partly and made a version with a 130 byte LUT pre-baked for an 8*sin(x) function (because the segment size is 8 pixels in this example). Thus, I could avoid two rounds of multiplications altogether and reduce the number of add hl, hl elsewhere by making sure my wobble amount is a multiple of the sprite size. This resulted in just 23 scanlines for calculating the whole tail. Pretty fast. |
|
|
Posted: Mon Feb 06, 2023 11:13 am |
I see, if your two constants are 8 and a multiple of it you are pre-multiplying your LUT and you avoid the multiplication with the first constant, while the second multiplication becomes shorter anyway as it's already saving three 16-bit left-shifts [add hl,hl]
not bad, not bad! :D |
|