6502.org Forum

All times are UTC




Posted: Tue Sep 06, 2022 4:16 pm
Joined: Sat Apr 11, 2020 7:28 pm
Posts: 341
Now this:

Code:
AVR Instruction Set Summary
===========================
Several updates of the AVR CPU during its lifetime has resulted in different flavors of the instruction set, especially for the timing of the instructions. Machine code level of compatibility is intact for all CPU versions with a very few exceptions related to the Reduced Core (AVRrc), though not all instructions are included in the instruction set for all devices. The table below contains the major versions of the AVR 8-bit CPUs. In addition to the different versions, there are differences dependent of the size of the device memory map. Typically these differences are handled by a C/EC++ compiler, but users that are porting code should be aware that the code execution can vary slightly in number of clock cycles.

Name    Device    Description
        Series

AVR     AT90      Original instruction set from 1995.
AVRe    megaAVR®  Multiply (xMULxx), Move Word (MOVW), and enhanced Load Program Memory (LPM) added to the AVR instruction set. No timing differences.
AVRe    tinyAVR®  Multiply not included, but else equal to AVRe for megaAVR.
AVRxm   XMEGA®    Significantly different timing compared to AVR(e). The Read Modify Write (RMW) and DES encryption instructions are unique to this version.
AVRxt   (AVR)     AVR 2016 and onwards. This variant is based on AVRe and AVRxm. Closer related to AVRe, but with improved timing.
AVRrc   tinyAVR   The Reduced Core AVR CPU was developed for ultra-low pinout (6-pin) size constrained devices.
                  The AVRrc therefore only has a 16 registers register-file (R31-R16) and a limited instruction set.


Posted: Tue Sep 06, 2022 4:38 pm
Joined: Sat Apr 11, 2020 7:28 pm
Posts: 341
I think these two sections of the AVR Instruction Set Manual illustrate the concept of HAL I'm talking about.

The paragraph titled AVR Instruction Set Summary explains it all. The machine code is the same for all CPUs, but some models drop certain features, and the instructions that depend on those features go with them. The size of the memory included in the rest of the MCU also makes a difference, and AFAIU changes in how the cores are designed and built internally mean that the same instruction takes more or fewer clock cycles on different models, much like what happens with the various 6502 implementations.
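To make the "same machine code, different subsets" idea concrete, here is a rough sketch in Python. It only encodes the rows of the quoted table that state their extras explicitly; the group names (`MUL`, `MOVW`, `LPM_ENHANCED`) and the function name are labels made up for this illustration, not anything from the manual.

```python
# Extra instruction groups per core, taken only from the table quoted above.
# Rows whose full feature list the table does not spell out are left out.
EXTRAS = {
    "AVR": set(),                                        # original 1995 set
    "AVRe (megaAVR)": {"MUL", "MOVW", "LPM_ENHANCED"},   # multiply included
    "AVRe (tinyAVR)": {"MOVW", "LPM_ENHANCED"},          # multiply not included
}


def cores_that_can_run(required):
    """Return the cores whose extras cover every group the code needs."""
    return sorted(name for name, extras in EXTRAS.items() if required <= extras)


# Code that uses a multiply instruction only fits one of the three cores.
mul_capable = cores_that_can_run({"MUL"})
```

Code that sticks to the base set runs on all three, which is the sense in which the binary encoding stays compatible while the available subset shrinks.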

Most AVR opcodes are a single 16-bit word (a few instructions take a second word), even though the CPUs are 8-bit. That does not seem to be a problem even for the smallest MCU implementations.

Side note: to a complete novice like me, porting 65xx code to AVR code does not look that hard.


Posted: Tue Sep 06, 2022 5:12 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8196
Location: Midwestern USA
Okay, despite having read this topic end-to-end, I’ve failed to see its value.

In computing, HAL is “hardware abstraction layer,” which is a body of software. If the desire is to port an operating system to different computers that run the same microprocessor but have otherwise-incompatible architectures, a HAL could be used to efface the differences from the perspective of the operating system. If you need a HAL, you write one, otherwise you don't.
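A minimal sketch of that software sense of the term, in Python for brevity. The two boards, their register addresses, and the names `ConsoleHAL`, `BoardA`, `BoardB` and `os_print` are all invented for illustration; they do not describe any real 65xx machine.

```python
from abc import ABC, abstractmethod


class ConsoleHAL(ABC):
    """What the operating system codes against; each board supplies its own."""

    @abstractmethod
    def put_char(self, ch: str) -> None:
        ...


class BoardA(ConsoleHAL):
    """Hypothetical board: one write to a UART data register at $D000."""

    def __init__(self):
        self.bus_writes = []  # (address, value) pairs, standing in for the bus

    def put_char(self, ch):
        self.bus_writes.append((0xD000, ord(ch)))


class BoardB(ConsoleHAL):
    """Hypothetical board: transmitter must be enabled before each byte."""

    def __init__(self):
        self.bus_writes = []

    def put_char(self, ch):
        self.bus_writes.append((0x7F01, 0x01))     # enable transmitter
        self.bus_writes.append((0x7F00, ord(ch)))  # then send the byte


def os_print(hal: ConsoleHAL, text: str) -> None:
    """OS-level routine: identical for every board, knows no addresses."""
    for ch in text:
        hal.put_char(ch)


board_a, board_b = BoardA(), BoardB()
os_print(board_a, "OK")
os_print(board_b, "OK")
```

The same `os_print` produces different bus traffic on each board, which is all a HAL is meant to achieve.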

In my curmudgeonly opinion, the topic, “Discussion: What would be the point of a 65xx HAL?”, has no point.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Tue Sep 06, 2022 5:30 pm
Joined: Sat Apr 11, 2020 7:28 pm
Posts: 341
About having several accumulators or more registers.

As GARTHWILSON said, having Zero and First page as registers would lead to not using the system bus at all for things happening there.

And I would add that if all the zero-page registers were accumulators, math done in software would be much faster: the results of multiple additions could be reused directly instead of being STored to memory and LoaDed back for every one of them, provided there were an instruction that stored the result of a math operation directly to a ZP location instead of to the accumulator, as happens now.
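A rough way to put numbers on this, as a Python sketch. The standard sequence uses the documented 6502 cycle counts for the zero-page forms in binary mode. The in-place ZP add is purely hypothetical, so its cost is left as a parameter and is not a real timing; the sketch also assumes the inputs may be overwritten.

```python
# Documented 6502 cycle counts (binary mode) for the instructions used below.
CYCLES = {"CLC": 2, "LDA zp": 3, "ADC zp": 3, "STA zp": 3}


def total_cycles(program, table=CYCLES):
    """Sum the cycle counts of a straight-line instruction sequence."""
    return sum(table[op] for op in program)


# Z = (A + B) + (C + D), keeping X = A + B and Y = C + D in zero page.
standard = [
    "CLC", "LDA zp", "ADC zp", "STA zp",  # X = A + B
    "CLC", "LDA zp", "ADC zp", "STA zp",  # Y = C + D
    "CLC", "ADC zp", "STA zp",            # Z = Y + X (Y is still in A)
]


def in_place_cycles(add_cost, adds=3):
    """Hypothetical 'add one ZP location into another' done `adds` times.

    add_cost is an assumed figure, not a real 6502 timing.
    """
    return adds * add_cost


standard_total = total_cycles(standard)
```

With these figures the standard sequence costs 30 cycles, so three in-place adds only win if each one costs fewer than 10 cycles. How much fewer depends entirely on how such an instruction would be encoded and decoded.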


Posted: Tue Sep 06, 2022 5:35 pm
Joined: Sat Apr 11, 2020 7:28 pm
Posts: 341
BigDumbDinosaur wrote:
Okay, despite having read this topic end-to-end, I’ve failed to see its value.

In computing, HAL is “hardware abstraction layer,” which is a body of software. If the desire is to port an operating system to different computers that run the same microprocessor but have otherwise-incompatible architectures, a HAL could be used to efface the differences from the perspective of the operating system. If you need a HAL, you write one, otherwise you don't.

In my curmudgeonly opinion, the topic, “Discussion: What would be the point of a 65xx HAL?”, has no point.


If the definition of HAL as I've used it is wrong, what would be the right term to be used in this case?


Posted: Tue Sep 06, 2022 5:38 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10811
Location: England
I've already mentioned the virtual machine upthread. Except you seem to be looking at something more like configurable computing.

It's all a bit unsatisfactory because you haven't given a coherent top-to-bottom description. It's as if you have a sketch and you'd like others to fill in the details for you.

I think the thing about the AVR example is that it's easier to design a complex ISA and microarchitecture and then subset it, than to make ad-hoc additions to something existing.

That said, point accelerations like multiply are fairly easy to add and to make use of.


Posted: Tue Sep 06, 2022 6:28 pm
Joined: Sat Apr 11, 2020 7:28 pm
Posts: 341
BigEd wrote:
I've already mentioned the virtual machine upthread. Except you seem to be looking at something more like configurable computing.


Is Configurable computing the right term for defining a CPU from the opcodes it could use?

BigEd wrote:
It's all a bit unsatisfactory because you haven't given a coherent top-to-bottom description. It's as if you have a sketch and you'd like others to fill in the details for you.


This thread is a discussion for the exposition and exchange of ideas, comments and knowledge. If something arises from it that someone with the appropriate knowledge could implement, that would be great.


Posted: Tue Sep 06, 2022 8:02 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10811
Location: England
Possibly Reconfigurable Computing is a relevant term (instead of Configurable!)
https://en.wikipedia.org/wiki/Reconfigurable_computing


Posted: Tue Sep 06, 2022 10:20 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8447
Location: Southern California
Sean wrote:
GARTHWILSON wrote:

I'm not familiar with a wide inventory of processors, but it is my understanding that a major push for lots of registers was partly to make it easier to write compilers.

My understanding is that having a large number of registers simplifies the writing of efficient register allocation/register spilling algorithms. Those algorithms are responsible for generating the machine code for moving variable values between registers and memory, and efficient versions of these algorithms make as few such moves as possible. This became far more important once the speed of memory could no longer keep up with processor speed.

See the comparison at https://www.westerndesigncenter.com/wdc ... risons.php .  At a given clock speed (and the same percentage of the maximum speed they were ever spec'ed for), the 68000 took longer to do a register-to-register byte move than the 65c02 took to do a memory-to-register byte move.  (The 65c02's advantage in that particular comparison is lost on the 16- and 32-bit moves, but that's because of the 68000's wider external data bus.)

Was the RAM speed really such a factor?  According to this table, there were 18ns SRAMs by Oct '81, 10ns by Dec '87, and 2.5ns by Dec '94.

Perhaps they were just buying insurance for the future in the designs of the early processors that had a lot of registers.  I need to find it again, but what I read about the 68000 design was that the designers went to compiler writers and asked them what they would like, which is where the many registers came from, although the 68000 itself never ran at a speed that was any challenge to the access speeds of available memory.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Tue Sep 06, 2022 10:20 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8447
Location: Southern California
tokafondo wrote:
About having several accumulators or more registers.

As GARTHWILSON said, having Zero and First page as registers would lead to not using the system bus at all for things happening there.

Yes; but I also said, there would be the effect that "OTOH, certain access techniques might be forfeited unless again the op-code set were expanded, again requiring two-byte op codes," which would wipe out much or all of the benefit. Instead of spending the extra cycle in a ZP access, the extra cycle would be spent on another op-code byte. After thinking about it more (but maybe I'm forgetting something I had considered earlier), now I think the same op codes could be used, but the instruction decoding inside the processor would get more complex (read: slower); regardless, some situations might require being aware of the differing numbers of cycles.

Quote:
And I would add that if all the zero-page registers were accumulators, math done in software would be much faster: the results of multiple additions could be reused directly instead of being STored to memory and LoaDed back for every one of them, provided there were an instruction that stored the result of a math operation directly to a ZP location instead of to the accumulator, as happens now.

Math-intensive applications may benefit; but my own control applications are not math-intensive. I have kind of lamented that the SMB, RMB, BBS, and BBR instructions are ZP-only. They're particularly useful for when I/O is in ZP, but that's seldom the case since having a lot of I/O would consume a lot of valuable ZP space. Actually, if I/O were in ZP and ZP had its own bus, I/O would have to be onboard the same IC. Microcontrollers have this of course, along with onboard memory, timers, and other processor support. The alternative would be to dedicate such extra pins to bring out another bus.

I think part of the solution for accommodating this is a wider data bus, and merging operands with the op code in an instruction word. There's nothing wrong with that of course; it's just that it would no longer be a 65xx processor.

Posted: Tue Sep 06, 2022 10:21 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8447
Location: Southern California
BigEd wrote:
I've already mentioned the virtual machine upthread. Except you seem to be looking at something more like configurable computing.

It's all a bit unsatisfactory because you haven't given a coherent top-to-bottom description. It's as if you have a sketch and you'd like others to fill in the details for you.

Yes; I'm still trying to figure it out, suspecting that whatever good idea it leads to is still in the future.

Posted: Wed Sep 07, 2022 1:41 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8196
Location: Midwestern USA
tokafondo wrote:
If the definition of HAL as I've used it is wrong, what would be the right term to be used in this case?

I can't answer that, as I haven't a clue as to what it is you are trying to define. The bulk of this discussion has to do with hardware attributes that are only tangentially related to a HAL.

Posted: Wed Sep 07, 2022 6:39 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10811
Location: England
> At a given clock speed (and the same percentage of the maximum speed they were ever spec'ed for), the 68000 took longer to do a register-to-register byte move than the 65c02 took to do a memory-to-register byte move.

You need to scale this by the memory access being multiple cycles on a 68k. "At a given clock speed" makes no sense as a comparison of such different architectures.

> the same percentage of the maximum speed they were ever spec'ed for
Although that is an interesting qualification, it brings in questions of implementation technology choices and market forces. It is possible that the internal complexity of the 68k limited the maximum clock speed enough for this to be a limiting factor. It's another reason to keep the complexity of CPUs under control. A massive microcode ROM (and nanocode ROM) is in a sense a simple thing, not a complex one, but by being large it might be slow, on chip.

Adventures on anycpu.org show that minor changes to a machine spec or to implementation tactics can reduce the clock speed on FPGA from 60MHz to 50MHz. That's a big step-down in performance, and might negate the advantage of extra accumulators.
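The arithmetic behind that last point, as a small Python sketch. Only the 60MHz and 50MHz figures come from the paragraph above; the 600-cycle workload and the function names are made up for the example.

```python
def run_time_us(cycles, clock_mhz):
    """Execution time in microseconds for a cycle count at a given clock."""
    return cycles / clock_mhz


def break_even_cycle_saving(f_before_mhz, f_after_mhz):
    """Fraction of cycles the slower-clocked design must save just to tie."""
    return 1 - f_after_mhz / f_before_mhz


# Dropping from 60MHz to 50MHz: how much must the new feature save?
saving_needed = break_even_cycle_saving(60, 50)

# Illustrative workload: 600 cycles on the original design at 60MHz.
baseline_us = run_time_us(600, 60)
```

Here `saving_needed` is one sixth, so the extra accumulators would have to remove roughly 17% of all cycles (600 down to 500 in the example) before the modified machine even matches the original.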


Posted: Wed Sep 07, 2022 6:50 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10811
Location: England
Parenthetically, I think this thread shows
- the merit in a good choice of title, which doesn't distract from the point in question
- a clear first post which sets up the actual conditions and the question being asked
- the difficulty of speculating about improved CPUs when not informed by considerations of the tooling, the microarchitecture, the instruction encoding, the influences on clock speed
- the improbability of bringing a napkin sketch to a forum and ending up with someone skilled having been inspired into building the thing

I don't wish to put anyone off, and I have nothing against wild ideas and people enjoying speculation, but making a working system is a bit more difficult.

If you really want to make an improved 6502 I might suggest this path
- become pretty good at programming the 6502
- gain a passing familiarity with two or three other CPUs, at machine code level
- understand what it is that an assembler does, and how instruction encodings work
- design and simulate a very simple accumulator machine
- add indexing to it
- design and simulate other simple machines
- implement at least one machine and study what resources are needed - busses, flops, logic, registers, adders, incrementers
- understand the reports you get from timing analysis
- make changes and investigate the consequences
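For the "design and simulate a very simple accumulator machine" step, something as small as the following Python sketch is enough to get started. The instruction set, the opcode values and the two-byte encoding are invented here for illustration and are not taken from any real CPU.

```python
# Hypothetical opcodes; every instruction except HLT is opcode + one operand.
HLT, LDI, LDA, ADD, STA = 0x00, 0x01, 0x02, 0x03, 0x04


def run(memory, max_steps=1000):
    """Execute from address 0 until HLT; return the final accumulator."""
    acc, pc = 0, 0
    for _ in range(max_steps):
        opcode = memory[pc]
        if opcode == HLT:
            return acc
        operand = memory[pc + 1]
        pc += 2
        if opcode == LDI:        # load immediate value
            acc = operand
        elif opcode == LDA:      # load from memory
            acc = memory[operand]
        elif opcode == ADD:      # add memory to accumulator, 8-bit wrap
            acc = (acc + memory[operand]) & 0xFF
        elif opcode == STA:      # store accumulator to memory
            memory[operand] = acc
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at {pc - 2:#06x}")
    raise RuntimeError("step limit reached without HLT")


# Demo program: acc = 5; acc += mem[0x10]; mem[0x11] = acc; halt.
demo = bytearray(32)
demo[0:7] = bytes([LDI, 5, ADD, 0x10, STA, 0x11, HLT])
demo[0x10] = 7
result = run(demo)
```

Adding indexing, the next step in the list, then means deciding where an index register lives, how the addressing mode is encoded, and what the change costs. Those are exactly the questions the rest of the path is about.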

There are many books to guide you on this journey. Recent books include Petzold's Code, and Nand to Tetris.

There are a few YouTubers who build CPUs from scratch and explain what they are doing. In fact I think Ben Eater started with a CPU build, before making a minimal 6502 system.

tokafondo, in the light of what you've learned from this thread, it might be worth starting a new one. A good title and a strong first post should set it off in the right direction, although it's an impossible hope for a thread to stay on-topic for long.


Posted: Wed Sep 07, 2022 8:06 am
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8447
Location: Southern California
BigEd wrote:
Quote:
At a given clock speed (and the same percentage of the maximum speed they were ever spec'ed for), the 68000 took longer to do a register-to-register byte move than the 65c02 took to do a memory-to-register byte move.

You need to scale this by the memory access being multiple cycles on a 68k. "At a given clock speed" makes no sense as a comparison of such different architectures.

The 68K's need for multiple cycles for a memory access shouldn't have anything to do with the speed at which it transfers a byte from one internal register to another, except that it takes multiple cycles just to read the instruction. In spite of the fact that the '02 requires faster memory to operate at the same number of MHz, it still did not challenge the speeds of RAM when it became available spec'ed for 8MHz. However, since plasmo has demonstrated that the W65C02S can run at approximately three times the rated speed, maybe the rated speeds have been held artificially low partly because of the speeds of available memory and glue logic. Bill Mensch did have NMOS 6502's running at 10MHz in the late 1970's, although none were sold with a 10MHz rating, or even a 5MHz rating.

I'm not down on the 68K—in fact, I hold the family in higher regard than I do certain other processors, some of which became popular—I'm only trying to encourage taking everything into consideration, which includes:
Quote:
implementation technology choices and market forces. It is possible that the internal complexity of the 68k limited the maximum clock speed enough for this to be a limiting factor. It's another reason to keep the complexity of CPUs under control. A massive microcode ROM (and nanocode ROM) is in a sense a simple thing, not a complex one, but by being large it might be slow, on chip.

Adventures on anycpu.org show that minor changes to a machine spec or to implementation tactics can reduce the clock speed on FPGA from 60MHz to 50MHz. That's a big step-down in performance, and might negate the advantage of extra accumulators.

The sirens call; but giving in won't be without its hidden penalties.

See viewtopic.php?p=8549#p8549 about a book on processor design.


Quote:
I don't wish to put anyone off, and I have nothing against wild ideas and people enjoying speculation,

Sometimes even the craziest ones can lead to an unexpected good idea as the discussion progresses. :D

Quote:
tokafondo, in the light of what you've learned from this thread, it might be worth starting a new one. A good title and a strong first post should set it off in the right direction

Note that you can also edit your head post and refine the topic title to clarify where you want to go with it.
