agsb/milliForth-RiscV

milliForth-RiscV

"To master riding bicycles you have to ride bicycles"

Started on 23/07/2025, @agsb

First version on 12/10/2025, @agsb

Minimal dictionary of compiled words on 04/12/2025, @agsb

Please see Changes and Notes.

Any Forth system depends on the I/O functions and the executable linkable format (ELF) of the host system.

The goal is to reach a functional, minimal-code Forth engine for the RISC-V ISA.

This is an implementation of the MilliForth (sector-forth) concept for the RISC-V ISA, using Minimal Indirect Thread Code.

MilliForth uses a minimal set of functions and primitives to make a Forth.

This version has minimal code (.text): only 454 bytes, 388 bytes for the Forth engine and 66 bytes for Linux system I/O, not counting ELF headers. It uses 56 bytes to load the ELF PIC address and 44 bytes for word headers.

No human-readable word names. Headers store a DJB2 hash instead.

No Terminal Input Buffer, just a token-to-hash ASCII stream parser.

Only one flag is used, IMMEDIATE, at the most significant bit (31) of the hash. That bit pattern on its own is also NaN, used to indicate errors.
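As a rough illustration of the flag layout described above, the following Python sketch models a header cell as a 32-bit integer. The helper names (`mark_immediate`, `is_immediate`, `strip_flag`) are hypothetical and do not come from the assembly source; only the bit position (31) and the NaN value 0x80000000 are taken from this README.

```python
# Sketch only: models the header hash cell as a 32-bit unsigned integer.
# Helper names are hypothetical; the real engine is RISC-V assembly.

MASK32 = 0xFFFFFFFF
IMMEDIATE = 1 << 31        # flag at the most significant bit (31)
NAN = 0x80000000           # same bit pattern, used to indicate errors


def mark_immediate(h: int) -> int:
    """Set the IMMEDIATE flag on a hash cell."""
    return (h | IMMEDIATE) & MASK32


def is_immediate(h: int) -> bool:
    """True when bit 31 of the hash cell is set."""
    return bool(h & IMMEDIATE)


def strip_flag(h: int) -> int:
    """Return the low 31 bits, ignoring the flag (an assumed comparison rule)."""
    return h & (IMMEDIATE - 1)
```

Whether the engine masks the flag before comparing hashes is an assumption here; the assembly source is the authority.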

There is a file with more core words in native code.

For Size

How to shrink to a minimal compiled size on RISC-V?

1. alignment is not needed, the size of opcodes is always 2 or 4 bytes;

2. choose registers to maximize use of compressed RISC-V opcodes;

3. warn the user about possible errors but abandon error checking;

4. use streams, no buffers;

5. do not speculate.

For Use

Both sector-riscv.S and extra-milliforth.S are working.
Test with:

cat t0.f t1.f t2.f - | sh doit.sh | tee output

t0.f is a minimal set of words, same as test0-riscv.f;

t1.f is a complement with hash and more words; 

the hyphen makes cat read standard input, normally the terminal (/dev/tty)

Or test with:

cat t0.f | sh doit.sh | tee z1

cat t0.f t1.f | sh doit.sh | tee z2

t1.f includes <builds create variable constant does> (_STUB_)

PS. 

Add a hyphen at the end of the cat file list to allow terminal I/O:
        cat t0.f t1.f t2.f - | sh doit.sh
        
An obscure bug causes a hash error on the first word.

Memory management is done by extending the dictionary
    into .bss, reserving bytes with .skip (default 64k * 4).
    No Linux calls are used for memory allocation. (Anyone ?)
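The idea of a dictionary that only grows inside one pre-reserved region can be sketched in Python as below. This is a model under assumptions, not the author's code: the class name `Dictionary` and its `here` field are hypothetical, the cell is assumed to be 4 bytes little-endian, and the overflow check is added for clarity even though the engine itself abandons error checking.

```python
import struct

CELL = 4                       # assumed cell size in bytes
DICT_BYTES = 64 * 1024 * 4     # default reservation quoted in the text


class Dictionary:
    """Fixed region that grows upward, standing in for the .bss reservation."""

    def __init__(self, size: int = DICT_BYTES) -> None:
        self.mem = bytearray(size)   # reserved once, never reallocated
        self.here = 0                # next free offset (hypothetical name)

    def comma(self, value: int) -> int:
        """Append one cell at `here` and advance it, like the internal comma."""
        if self.here + CELL > len(self.mem):
            raise MemoryError("dictionary full")  # the real engine does not check
        struct.pack_into("<I", self.mem, self.here, value & 0xFFFFFFFF)
        self.here += CELL
        return self.here
```

Because the region is reserved at link time, no allocation call is ever needed; the cost is a fixed upper bound on dictionary size.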

The source can be compiled with the 'missed' hack and
    a more extensive native-code word set.

No name, just a hash

"WE STI.. DON.. SEE THE NEE. FOR 31 CHA...... NAM.. IN THE GEN.... CAS."

Full text

See the letter to the Editor of Forth Dimensions [Moore 1983] concerning the practice of storing the names of Forth words as a count and the first three characters.

A count and the first three characters: four bytes were enough.

"AI uses hash code as word, Humans uses semantics as word" Liang Ng

In this century, computers use hashes to compare contents, so why not use a 4-byte hash to identify tokens?

This version of milliForth uses a 32-bit DJB2 hash. It provides fast comparison during compilation and has a small footprint.
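For readers unfamiliar with DJB2, here is a minimal Python sketch of the classic "hash times 33 plus character" variant truncated to 32 bits. It is an assumption that the assembly routine matches this exactly; the seed, an xor variant, or special handling of bit 31 may differ, so treat it as an illustration of the technique only.

```python
# Sketch of a 32-bit DJB2 hash (classic additive variant).
# Assumption: details in the assembly source may differ.

MASK32 = 0xFFFFFFFF  # keep every intermediate result in one 32-bit cell


def djb2(token: str) -> int:
    """Return the 32-bit DJB2 hash of an ASCII token."""
    h = 5381  # traditional DJB2 seed
    for byte in token.encode("ascii"):
        h = ((h << 5) + h + byte) & MASK32  # h * 33 + byte, wrapped to 32 bits
    return h


def same_word(a: str, b: str) -> bool:
    """Comparing two words is a single 4-byte compare, not a string compare."""
    return djb2(a) == djb2(b)
```

With this scheme a dictionary search compares one cell per entry, regardless of how long the word name was when typed.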

For a 32-bit DJB2 hash, collisions become likely only at around 65,536 (2^16) items, which would require a very large dictionary.
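The collision estimate can be checked with the usual birthday approximation, p ≈ 1 - exp(-n(n-1) / 2N), where n is the number of words and N the number of possible hash values. The sketch below assumes an ideal, uniformly distributed hash, which DJB2 only approximates; the function name is hypothetical.

```python
import math


def collision_probability(items: int, bits: int = 32) -> float:
    """Approximate chance that at least two of `items` hashes collide.

    Assumes an ideal uniform hash over 2**bits values (birthday bound).
    """
    space = 2 ** bits
    return 1.0 - math.exp(-items * (items - 1) / (2.0 * space))
```

Under these assumptions, 65,536 words give a collision chance of roughly 39% in a full 32-bit space, while a dictionary of 1,000 words stays near 0.01%. If bit 31 is reserved for the IMMEDIATE flag, the usable space is 31 bits and the figure at 65,536 words rises to roughly 63%.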

No Terminal Input Buffer

"The spice must flow"

Chuck executes or compiles each word individually rather than line by line. In fact Chuck doesn't really have lines. I will also go word by word rather than line by line in aha.

Jeff Fox ?

Why no Terminal Input Buffer?

Forth is not an editor. It does not need undo, redo, copy or paste.

The input is a stream; tokens just flow.

A token is being defined, has been defined, or has not been defined and Forth reacts.
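A bufferless token-to-hash parser can be sketched as follows: characters are read one at a time and folded straight into a running hash, so the token text is never stored. This is a Python model under assumptions (classic DJB2, any character at or below space acting as a delimiter); the names `token_hashes` and `react` are hypothetical, and `react` ignores the IMMEDIATE flag for brevity.

```python
import io


def token_hashes(stream):
    """Yield one 32-bit hash per whitespace-delimited token, with no buffer."""
    h = None                          # None means "between tokens"
    while True:
        ch = stream.read(1)           # one character at a time
        if ch == "":                  # end of stream
            break
        if ch <= " ":                 # space and control characters delimit
            if h is not None:
                yield h
                h = None
        else:
            if h is None:
                h = 5381              # start a new token (assumed DJB2 seed)
            h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    if h is not None:                 # token ended by end of stream
        yield h


def react(h, dictionary, compiling):
    """Decide what to do with a token hash (simplified, no IMMEDIATE words)."""
    if h not in dictionary:
        return "miss"
    return "compile" if compiling else "execute"
```

For example, `list(token_hashes(io.StringIO("  a \n a")))` yields the same hash twice, one per token, without ever holding a line of input.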

Internals

This version uses a DJB2 hash for dictionary entries and relative branches, and includes:

minimal primitives:

    u@    return the address of user structure
    0#    if top of data stack is not zero returns -1 (0xFFFFFFFF)

    +     adds two values at top of data stack
    NAND  logic not-and the two values at top of data stack
    
    @     fetch the value of the cell whose address is at top of data stack
    !     store a value into the cell whose address is at top of data stack

    :     starts compiling a new word
    ;     stops compiling a new word
    
    EXIT  ends a word

    KEY   get a char from the default terminal (stdin)
    EMIT  put a char to the default terminal (stdout)
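To show why so few primitives are enough, the sketch below models `+`, `NAND` and `0#` on 32-bit cells in Python and derives the usual logic and arithmetic words from them. The derived names (`invert`, `and_`, `or_`, `negate`, `sub`) are illustrative and are not claimed to match the definitions in t0.f or t1.f.

```python
MASK = 0xFFFFFFFF  # 32-bit cell


def nand(a: int, b: int) -> int:
    """NAND primitive: bitwise not-and of two cells."""
    return ~(a & b) & MASK


def add(a: int, b: int) -> int:
    """+ primitive: wrapping 32-bit addition."""
    return (a + b) & MASK


def zero_ne(a: int) -> int:
    """0# primitive: -1 (0xFFFFFFFF) when the cell is not zero, else 0."""
    return MASK if a & MASK else 0


# Everything below is built only from the primitives above.
def invert(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return invert(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(invert(a), invert(b))


def negate(a: int) -> int:
    return add(invert(a), 1)       # two's complement


def sub(a: int, b: int) -> int:
    return add(a, negate(b))
```

In Forth terms the same constructions are short colon definitions, which is how a minimal kernel can bootstrap a fuller vocabulary from source text.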
        
only internals: 
    
    main, cold, warm, 
    miss, abort, quit, warp,
    token, skip, hash, scan, mask, 
    find, eval, compile, execute,   
    unnest, next, pick, jump, nest, move,
    comma, _init, _getc, _putc, _exit

    ps. next is not the NEXT of FOR NEXT loop !    

with externals, reached by ecall to Linux:

    _getc, _putc, _exit, ( _fcntl, _init ) 

Vocabulary

More words in native code are selectable in defines.S

E.g., extras:

;$      execute native code at instruction pointer (IP), see Notes
NAN     place 0x80000000 into stack
LSHIFT  shift left a value by n bits
RSHIFT  shift right a value by n bits
ABORT   restart the Forth interpreter
BYE     ends the Forth, return to system
.       show the cell at top of data stack in hexadecimal 
$       parse next token as a signed hexadecimal integer and push it to TOS

A full list of primitives is in Word Lists.

The Language

For a Forth language primer, see Starting Forth.

For a how-to on Forth from the inside, see JonasForth.

For A Problem Oriented Language, see POL.

References
