FiM++ Wiki

Kyli Rouge

.JS file

Compile to generic JavaScript, runnable by anything that can interpret it

  1. Whitespace is reduced as much as possible (replace /[^\S\r\n]+/g, i.e. any run of whitespace other than line breaks, with a single space)
    • Whitespace is replaced with underscores in class, function, and variable names, instead of opting for camel-case (Hello World becomes Hello_World)
    • Underscores already present in class, function, and variable names are replaced with two underscores (Hello_World becomes Hello__World)
  2. Punctuation is replaced with ; (replace /[!,\.:\?…‽]+/gim with ;)
  3. Keywords are replaced with JavaScript keywords and functions
    • Binary prefix operators which have a partner infix operator (think add 12 and 2) are converted to a single infix operator (think 12+2)
    • Class declarations are stripped of their inheritance and their ending ; is replaced with a {. The inheritance is then expressed after the class body by overriding prototype.
      • Dear Princess Celestia:Hello World; becomes function Hello_World(){}Hello_World.prototype=new Princess_Celestia();
  4. Literals are translated to JavaScript equivalent
    • Fancy quotes replaced with plain ones (“Hello, World!” becomes "Hello, World!")
    • Booleans are replaced with true and false
    • Replace "'"'" escapes with \"
      • /(["”“]['‘’]){2}["”“]/gim becomes \"
      • /["”“]['‘’]["”“]['‘’](?=\s*[!,\.:\?…‽])/gim becomes \""
      • /\s['‘’]["”“]['‘’]["”“]/gim becomes "\"
    • Replace in-string newlines with \r\n
    • Writer name added at top as a documentation comment of the format
      /**
       * This report, entitled “Hello World”, was written by
       * Kyli Rouge on the 21st of January, 2014
       */
      
  5. If compiler is told to do so, spacing and formatting are added to make it readable
    • Original source added as a comment to the top of the program
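
Steps 1 and 4 above can be sketched roughly as follows. This is only an illustration, not the actual compiler: the helper names (collapseWhitespace, mangleName, plainQuotes) are made up for the sketch, and the whitespace pattern assumes that line breaks are meant to survive step 1.

```javascript
// Hypothetical helpers, named for illustration only.

// Step 1: collapse each run of whitespace (other than CR/LF) into one space.
function collapseWhitespace(source) {
  return source.replace(/[^\S\r\n]+/g, " ");
}

// Step 1, sub-items: double existing underscores first, then turn the
// remaining runs of whitespace into single underscores.
function mangleName(name) {
  return name.trim().replace(/_/g, "__").replace(/\s+/g, "_");
}

// Step 4: replace fancy quotes with plain ones.
function plainQuotes(text) {
  return text.replace(/[“”]/g, '"').replace(/[‘’]/g, "'");
}
```

Doubling the underscores before replacing whitespace keeps Hello World and Hello_World from mangling to the same identifier.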


JavaScript representations of Phrases

  1. To do

Example

Hello World.FPP (190B)

Dear Princess Celestia:Hello World!

Today I learned how to say hello world!
I said “Hello, World!”!
That's all about how to say hello world.

Your faithful student, Kyli Rouge.

Would compile into one of the following 3 files, depending on the compiler options specified:

Default

Hello World.JS (416B)

Princess_Celestia = function(){};

/**
 * This report, entitled “Hello World”, was written by
 * Kyli Rouge on the 21st of January, 2014
 */
function Hello_World()
{
	this.today = function()
	{
		this.how_to_say_hello_world();
	};
	
	this.how_to_say_hello_world = function()
	{
		console.log("Hello, World!");
	};
}
Hello_World.prototype = new Princess_Celestia();

new Hello_World().today();

Long

Compiler told to keep source

Hello World.long.JS (641B)

/*
(Original FiM++ Source)
Dear Princess Celestia: Hello World!

Today I learned how to say hello world!
I said “Hello, World!”!
That's all about how to say hello world.

Your faithful student, Kyli Rouge.
*/

Princess_Celestia = function(){};

/**
 * This report, entitled “Hello World”, was written by
 * Kyli Rouge on the 21st of January, 2014
 */
function Hello_World()
{
	this.today = function()
	{
		this.how_to_say_hello_world();
	};
	
	this.how_to_say_hello_world = function()
	{
		console.log("Hello, World!");
	};
}
Hello_World.prototype = new Princess_Celestia();

new Hello_World().today();


Min

Compiler told to minify

Hello World.min.JS (247B)

Princess_Celestia=function(){};function Hello_World(){this.today=function(){this.how_to_say_hello_world()};this.how_to_say_hello_world=function(){console.log("Hello, World!")}}Hello_World.prototype=new Princess_Celestia();new Hello_World().today()

99 Jugs of Cider

Long

99 Jugs of Cider.long.JS (2941B, ~2.87KiB)

/*
Remember when I wrote about Applejack? (I don't!)

Dear Princess Cadence and Shining Armor: Cider Jugs.

Today I learned Applejack's Drinking Song.
Did you know that Applejack likes the number 99? (Applejack likes a lot of things...)
I remembered how to sing the drinking song using Applejack.
That's all about Applejack's Drinking Song!



I learned how to sing the drinking song using the number ciders.
As long as ciders were more than 1:
I sang ciders" jugs of cider on the wall, "ciders" jugs of cider,".
There was one less ciders.
When ciders had more than 1,
I sang "Take one down and pass it around, "ciders" jugs of cider on the wall."!
Otherwise,
I sang "Take one down and pass it around, 1 jug of cider on the wall."!
That's what I would do,
That's what I did.

I sang "1 jug of cider on the wall, 1 jug of cider.
Take it down and pass it around, no more jugs of cider on the wall.

No more jugs of cider on the wall, no more jugs of cider.
Go to the celler, get some more, 99 jugs of cider on the wall.".
That's all about how to sing the Drinking Song!

Your faithful student, Twilight Sparkle.

P.S. Twilight's drunken state truely frightened me, so I couldn't disregard her order to send you this letter. Who would have thought her first reaction to hard cider would be this... explosive? I need your advice, your help, everything, on how to deal with her drunk... self. -Spike
*/
function Applejack() {} /*I don't!*/
function Princess_Cadence() {}
function Shining_Armor() {}

//Dear Princess Cadence and Shining Armor: Cider Jugs.
Cider_Jugs = function ()
{
	this.today = function()
	{
		this.Applejacks_Drinking_Song();
	};
	
	this.Applejacks_Drinking_Song = function()
	{
		var Applejack = 99; /*Applejack likes a lot of things...*/
		this.how_to_sing_the_drinking_song(Applejack);
	};
	
	this.how_to_sing_the_drinking_song = function(ciders)
	{
		while(ciders > 1)
		{
			console.log(ciders + " jugs of cider on the wall, " + ciders + " jugs of cider,");
			--ciders;
			if(ciders > 1)
			{
				console.log("Take one down and pass it around, " + ciders + " jugs of cider on the wall.");
			}
			else
			{
				console.log("Take one down and pass it around, 1 jug of cider on the wall.");
			}
		}
		
		console.log("1 jug of cider on the wall, 1 jug of cider.\r\nTake it down and pass it around, no more jugs of cider on the wall.\r\nNo more jugs of cider on the wall, no more jugs of cider.\r\nGo to the celler, get some more, 99 jugs of cider on the wall.");
	};
};

Cider_Jugs.prototype = new Princess_Cadence();
new Cider_Jugs().today();

// Twilight's drunken state truely frightened me, so I couldn't disregard her order to send you this letter. Who would have thought her first reaction to hard cider would be this... explosive? I need your advice, your help, everything, on how to deal with her drunk... self. -Spike



Interpretation steps

  1. Replace all actual newlines (U+000A, U+000D, or the pair U+000D U+000A) in string literals with \r\n
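
A minimal sketch of this step, assuming the string literal has already been extracted from the source; the function name escapeStringNewlines is hypothetical:

```javascript
// Replace every real line break inside a literal with the two-character
// escape sequences \r\n, so the literal fits on one JavaScript line.
function escapeStringNewlines(literal) {
  // The CR LF pair is listed first so it becomes one escape, not two.
  return literal.replace(/\r\n|\r|\n/g, "\\r\\n");
}
```

Matching the pair before the single characters matters: otherwise a Windows-style line break would be emitted twice.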

Feedback

Pros

  • Easy to decompile
  • Easy to manually compile
  • Easy to run (in any browser)

Cons

  • Slower than code compiled to assembly
  • Larger output file than assembly

.FR file

A .FR file (abbreviation for "Friendship Report") is proprietary FiM++ bytecode and must be read and executed by a virtual machine.

  • Whitespace and comments are entirely removed (except for programmer name)
  • Keywords are represented by Unicode characters starting at U+0000 (see the list below), which represent their function, not the actual keyword used (any two synonyms are compiled to the same character).
    • Binary prefix operators which have a partner infix operator (think add 12 and 2) are converted to a single infix operator (think 12+2)
  • Variables, class names, and method names are compiled into hex digits
    • surrounded by the Unicode character
  • Literals are kept as-is, with any source quotes removed
    • Booleans are preceded by
    • Numbers are preceded by
    • Characters are preceded by
    • Strings are surrounded by
    • Replace any instance of in character literals or in String literals with U+0000. This is a weak point, as it leaves two characters being represented the same way.
  • Punctuation is compiled into 

Unicode representations of Phrases

  1. (U+0000): CLASS
  2. (U+0001): END_CLASS
  3. (U+0002): IMPORT
  4. (U+0003): IMPLEMENTS
  5. (U+0004): METHOD
  6. (U+0005): MANE_METHOD
  7. (U+0006): END_METHOD
  8. (U+0007): RETURN_TYPE
  9. (U+0008): PARAMETERS
  10. (U+0009): RETURN
  11. (U+000A): RECALL
  12. (U+000B): VARIABLE
  13. (U+000C): BOOL
  14. (U+000D): BOOL_ARRAY
  15. (U+000E): CHARACTER
  16. (U+000F): CHARACTER_ARRAY
  17. (U+0010): CHARACTER_ARRAY_ARRAY
  18. (U+0011): NUMBER
  19. (U+0012): NUMBER_ARRAY
  20. (U+0013): ASSIGN
  21. (U+0014): ASSIGN_CONSTANT
  22. (U+0015): REASSIGN
  23. (U+0016): IF
  24. (U+0017): IF_PARTNER_SUF
  25. (U+0018): END_IF
  26. (U+0019): ELSE
  27. (U+001A): END_ELSE
  28. (U+001B): SWITCH
  29. (U+001C): CASE
  30. (U+001D): CASE_PARTNER_POST
  31. (U+001E): DEFAULT
  32. (U+001F): WHILE
  33. ( ): END_WHILE
  34. (!): DO_WHILE
  35. ("): END_DO_WHILE
  36. (#): PRINT
  37. ($): PROMPT
  38. (%): READ
  39. (&): ADD_IN
  40. ('): ADD_PRE
  41. ((): ADD_PRE_PARTNER_IN
  42. ()): DIVIDE_IN
  43. (*): DIVIDE_PRE
  44. (+): DIVIDE_PRE_PARTNER_IN
  45. (,): MULTIPLY_IN
  46. (-): MULTIPLY_PRE
  47. (.): MULTIPLY_PRE_PARTNER_IN
  48. (/): SUBTRACT_IN
  49. (0): SUBTRACT_PRE
  50. (1): SUBTRACT_PRE_PARTNER_IN
  51. (2): DECREMENT
  52. (3): INCREMENT
  53. (4): AND
  54. (5): OR
  55. (6): XOR
  56. (7): XOR_PARTNER_IN
  57. (8): NOT
  58. (9): EQUAL
  59. (:): NOT_EQUAL
  60. (;): GREATER_THAN
  61. (<): GREATER_THAN_OR_EQUAL
  62. (=): LESS_THAN
  63. (>): LESS_THAN_OR_EQUAL
  64. (?): NOTHING
  65. (@): TRUE
  66. (A): FALSE

Example

Hello World.FPP (190B)

Dear Princess Celestia:Hello World!

Today I learned how to say hello world!
I said “Hello, World!”!
That's all about how to say hello world.

Your faithful student, Kyli Rouge.

Would compile into:

Hello World.FR (93B, 51% compression); linked rather than shown inline, as Wikia won't allow special characters on the site

Interpretation steps

Hello World.FPP (190B)

  1. Read in original code:
    • Dear Princess Celestia:Hello World!

      Today I learned how to say hello world!
      I said “Hello, World!”!
      That's all about how to say hello world.

      Your faithful student, Kyli Rouge.
  2. Remove comments (except the special programmer name comment) and unnecessary whitespace:
    • Dear Princess Celestia:Hello World!Today I learned how to say hello world!I said “Hello, World!”!That's all about how to say hello world.Your faithful student, Kyli Rouge.
  3. Replace phrases and punctuation with generics:
    • CLASS Princess Celestia;Hello World;MAIN_METHOD how to say hello world;PRINT “Hello, World!”;END_METHOD how to say hello world;END_CLASS; Kyli Rouge;
  4. Surround literals with special Unicode characters, removing quotes:
    • CLASS Princess Celestia;Hello World;MAIN_METHOD how to say hello world;PRINT Hello, World!;END_METHOD how to say hello world;END_CLASS; Kyli Rouge;
  5. Replace class, method, and variable names with numbers:
    • CLASS 0;1;MAIN_METHOD 2;PRINT Hello, World!;END_METHOD 2;END_CLASS; Kyli Rouge;
  6. Remove remaining whitespace:
    • CLASS0;1;MAIN_METHOD2;PRINTHello, World!;END_METHOD2;END_CLASS; Kyli Rouge;
  7. Replace generic phrases and punctuation with special Unicode characters (unprintable characters are written here as [U+XXXX]):
    • [U+0000]0[U+FFFF]1[U+FFFF][U+0005]2[U+FFFF]#Hello, World![U+FFFF][U+0006]2[U+FFFF][U+0001][U+FFFF] Kyli Rouge[U+FFFF]
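
Step 5 amounts to handing out the next free number the first time a name is seen. A rough sketch, assuming names are compared exactly as written and numbered from 0 in hexadecimal; makeNameTable is a hypothetical helper, not part of any existing compiler:

```javascript
// Returns a function that maps each distinct name to a stable hex id,
// assigned in order of first appearance.
function makeNameTable() {
  const ids = new Map();
  return function idFor(name) {
    if (!ids.has(name)) {
      ids.set(name, ids.size); // next unused number
    }
    return ids.get(name).toString(16);
  };
}

const idFor = makeNameTable();
const helloWorldIds = [
  idFor("Princess Celestia"),
  idFor("Hello World"),
  idFor("how to say hello world"),
  idFor("how to say hello world"), // second mention reuses the same id
];
```

For the Hello World report this yields 0, 1, 2, 2, the same numbers as in step 5.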

Hello World.FR (93B)

Feedback

Pros

  • Easy to decompile
  • Easy to manually compile
  • Easier to read than my pseudo-assembly :)

Cons

  • Proprietary; requires a virtual machine
  • Some redundancies with surrounding characters
  • It may be too simple to decompile manually, and so could not provide the level of "security" that proprietary bytecode needs.
  • It's unnaturally difficult for a program to read, and I personally think it's not bytecode but a crunched compression. Just look at Java bytecode! It's a totally different thing from the source!
  • Some statements are redundant. Why have both infix and prefix operators when you can base your code entirely on the former?
    • Good point. I'll work on it :3

UNiTY (Mattia Borgo)

I was thinking of a bytecode that is easier to execute, much like that of a VM.

Format: .fb and .fba files

A .fb file (FiM++ bytecode) is a language in its own right, sitting beneath FiM++ and executed by a VM.

.fba files (FiM++ bytecode archive) are libraries of classes.

Format Specification

.fb

The first byte of a .fb file is always 0xfb (û in Latin-1).

After it comes the author's name, stored in a Pascal-like manner (length of the string as a byte, then the string).

Then the compiled bytecode.

Instructions

Code  Instruction  Arguments                   Meaning
0x01  NEWC         depends*                    Push a new class slot.
0x02  MANC         -                           Reserve the mane class slot.
0x10  NEWM         1 byte, 1 byte, depends***  Declare a new method in the selected class slot.
0x11  MANM         -                           Declare a mane method in the selected class slot.
0x12  CALL         1 word, 1 word, depends**   Call a method of a class and discard result.
0x13  RETM         depends**                   Return a value.
0x20  NEWV         -                           Push a new variable.
0x21  ASSV         1 dword, depends**          Assign a value to the selected variable.
0x23  ASSC         1 dword, depends**          Assign a constant value to the selected variable.
0x40  WHEN         depends**                   If statement.
0x42  ELSE         -                           Else statement.
0x50  SWTC         depends**                   Switch statement.
0x51  CASE         depends**                   Case statement.
0x61  WHLE         depends**                   While statement.
0x62  WHDO         depends**                   Do while statement.
0x63  WEND         -                           End of (do) while statement.
0x80  PRNT         depends**                   Print statement.
0x81  INPT         1 dword, depends**          Input statement.
0xa0  INCR         1 dword                     Increment statement.
0xa1  DECR         1 dword                     Decrement statement.
0xf0  PORT         1 byte                      Runtime-load a specific builtin module

v0.01 -> Interfaces aren't needed, since they can be applied at compile time.

*

Class declaration has a different type of instruction: in the list above its argument is marked with a * (single asterisk).

To make things clear, I'll start with an example.

Dear Princess Luna and Shining Armor and Cadence: An Update:

Dear directly translates into NEWC, which needs a word to define its superclass, then a byte specifying the number of interfaces, followed by their respective words.

We'll assume Shining Armor and Cadence have interface numbers 1 and 2, respectively; Princess Luna will be class slot 1.

Hand-coding the example gives

NEWC 1,2:1,2

This translates into the following bytes:

0x01 0x00 0x01 0x02 0x00 0x01 0x00 0x02
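
The byte layout can be reproduced with a few lines of code. This is a sketch under one assumption read off the bytes above rather than stated in the specification: words are two bytes, most significant byte first. The names word and encodeNEWC are hypothetical.

```javascript
// Split a 16-bit value into two bytes, high byte first (assumed byte order).
function word(n) {
  return [(n >> 8) & 0xff, n & 0xff];
}

// NEWC: opcode 0x01, superclass word, interface count byte, interface words.
function encodeNEWC(superclass, interfaces) {
  return [
    0x01,
    ...word(superclass),
    interfaces.length,
    ...interfaces.flatMap((id) => word(id)),
  ];
}

// NEWC 1,2:1,2 from the example above.
const newcBytes = encodeNEWC(1, [1, 2]);
```

Here newcBytes comes out as 0x01 0x00 0x01 0x02 0x00 0x01 0x00 0x02, the same eight bytes as the hand-coded result.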

**

Many instructions have a ** (double asterisk) next to their arguments.

It essentially means that the argument is a runtime-calculated value.

It follows the syntax described below (Values and Operators).

Example:

Did you know that Spike’s age is the number 10?

This translates to:

NEWV
ASSV 0,n:10

***

Method declaration has a different type of instruction: in the list above its argument is marked with a *** (triple asterisk).

To make things clear, I'll start with an example.

I learned how to take the sum of a set of numbers with a number using the numbers X.

I learned directly translates into NEWM, which needs a byte defining the return type of the method and a byte specifying the number of input arguments plus 1.

All arguments' types are specified.

Hand-coding the example gives

NEWM n,0:n*

This translates into the following bytes:

0x10 0x4f 0x00 0x00 0x5f

Values and Operators

Operator  Low nibble  Meaning
%         0x0         Explicit end of value or placeholder
+         0x1         Add
*         0x2         Multiply
-         0x3         Subtract
/         0x4         Divide
&         0x5         And
^         0x6         Exclusive or
|         0x7         Or
!         0x8         Not
>         0x9         Greater than
>=        0xa         Greater than or equal
<         0xb         Less than
=<        0xc         Less than or equal
=         0xd         Equal
!=        0xe         Not equal
$         0xf         Explicit start of value

Type  High nibble  Meaning
_     0x0          No type (nothing)
b:    0x1          Boolean (simple)
c:    0x2          Character (simple)
b*:   0x3          Boolean (array)
n:    0x4          Number (simple)
n*:   0x5          Number (array)
c*:   0x6          Character (array)
c**:  0x7          Character (2-array)
v:    0x8          Variable
f:    0x9-0xf      Method call with masked type

  • Numbers: 64-bit floating point values according to the IEEE 754 specs.
  • Booleans: either 0x00 or 0xff, FALSE and TRUE, respectively.
  • Characters: 1 Unicode character.
  • Variables: 1 double word pointer.
  • Methods: 1 word pointer to class, 1 word pointer to method, arguments.
  • Arrays in a Pascal-like manner.
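
Reading the two tables together, each value byte appears to carry a type in its high nibble and an operator in its low nibble. The sketch below illustrates that reading only; valueByte is a hypothetical helper, and the method-call range 0x9-0xf is left out because its masking is not spelled out here.

```javascript
// Low nibble: operators.
const OPERATORS = new Map([
  ["%", 0x0], ["+", 0x1], ["*", 0x2], ["-", 0x3], ["/", 0x4], ["&", 0x5],
  ["^", 0x6], ["|", 0x7], ["!", 0x8], [">", 0x9], [">=", 0xa], ["<", 0xb],
  ["=<", 0xc], ["=", 0xd], ["!=", 0xe], ["$", 0xf],
]);

// High nibble: types (method calls, 0x9-0xf, are not modelled).
const TYPES = new Map([
  ["_", 0x0], ["b", 0x1], ["c", 0x2], ["b*", 0x3], ["n", 0x4],
  ["n*", 0x5], ["c*", 0x6], ["c**", 0x7], ["v", 0x8],
]);

// Combine a type and an operator into one byte.
function valueByte(type, operator) {
  if (!TYPES.has(type)) throw new Error("Unknown type: " + type);
  if (!OPERATORS.has(operator)) throw new Error("Unknown operator: " + operator);
  return (TYPES.get(type) << 4) | OPERATORS.get(operator);
}
```

Under this reading, valueByte("n", "$") is 0x4f and valueByte("n*", "$") is 0x5f, the two type bytes in the NEWM example, while valueByte("c*", "%") is 0x60, the byte that follows PRNT in the Hello World bytecode.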

Examples

Hello World program:

Dear Princess Celestia:Hello World!

Today I learned how to say hello world!
I said “Hello, World!”!
That's all about how to say hello world.

Your faithful student, Kyli Rouge.

Compiled language:

> author Kyli Rouge
MANC
MANM
PRNT c*:"Hello, World!"
ENDM
ENDC

Compiled bytecode:

0xfb 0x0a 0x4b 0x79 0x6c 0x69 0x20 0x52 0x6f 0x75 0x67 0x65 0x02 0x11 0x80 0x60 0x00 0x0d 0x48 0x65 0x6c 0x6c 0x6f 0x2c 0x20 0x57 0x6f 0x72 0x6c 0x64 0x21 0x14 0x04

or, in one dump:

fb0a4b796c6920526f75676502118060000d48656c6c6f2c20576f726c64211404

The compiled bytecode's length is 33B, with a compression ratio of 81%.
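
The dump can be rebuilt byte by byte. The following is a sketch, not a real assembler: assembleHelloWorld is a hypothetical name, the text is assumed to be plain ASCII, and the closing bytes 0x14 (ENDM) and 0x04 (ENDC) are taken from the dump above since they do not appear in the instruction table.

```javascript
// Assemble: header, Pascal-style author, MANC, MANM, PRNT c*:"...", ENDM, ENDC.
function assembleHelloWorld(author, message) {
  if (author.length > 0xff) throw new Error("Author name too long for one length byte");
  const ascii = (s) => Array.from(s, (ch) => ch.charCodeAt(0));
  const bytes = [
    0xfb,                                   // file signature
    author.length, ...ascii(author),        // length byte, then the name
    0x02,                                   // MANC
    0x11,                                   // MANM
    0x80,                                   // PRNT
    0x60,                                   // c* type byte
    (message.length >> 8) & 0xff,           // array length word, high byte
    message.length & 0xff,                  // array length word, low byte
    ...ascii(message),
    0x14,                                   // ENDM (value taken from the dump)
    0x04,                                   // ENDC (value taken from the dump)
  ];
  return bytes.map((b) => b.toString(16).padStart(2, "0")).join("");
}

const helloWorldDump = assembleHelloWorld("Kyli Rouge", "Hello, World!");
```

The resulting string is 66 hex digits long, i.e. the 33 bytes quoted above.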

Feedback

Pros

  • Seems like it'd easily compile to BASIC or Assembler, making it astoundingly fast.

Cons

  • Numbers should be IEEE 754 64-bit floating-point values, and this seems to only support 16-bit values.
    • Were those FP values? Oh man. Anyway, I never said numbers would take 16 bit values.
      • I know, but I don't see any 64-bit support, here...
        • Now there is! :3
          • No, there's not. Everything's still in 16-bit format, except Booleans, which you just made 32-bits...
            • Well... numbers take eight bytes, like a C double. I thought it was obvious, since you can't allocate 64 bits in a single memory cell. By the way, the instruction format, as far as I can tell, is 8-bit.
  • Booleans take up 16 bits, rather than the ideal 1
    • Done.
      • If by "Done", you mean "Increased to 32 bits rather than decreased to 1 byte", then yes :I
        • I don't know where you get a 32-bit count from; at most these are 16 bits (0xffff), or 24 with the operator. Anyway, I read your advice backwards, and I thought the original implementation used 16-bit booleans rather than my original byte (and if by 16 bits you meant the operator AND the value, well, I couldn't find another way).
  • You use > as both a comment starter and operator
    • Actually, that was only a way of saying who the author is, as there are no comments in a compiled source.

Alxg833 (Alex Gould)

I wouldn't really trust myself with full bytecode editing yet, so I'm just going to work on translating files into Java and C++. A hypothetical compiler would have settings to create one or both.

Format: .fppj and .fppc files

A .fppj file would just be a standard Java file, with a signature at the beginning to mark it for translation back into FiM++ (instead of just displaying it in the FiM++ editor as a Java program). Descriptions of which specific commands the programmer used (said vs. sang, etc.) would be contained in Unicode sequences written in comments at the end of each line. (The user would never actually see the Java code, and would therefore not notice the extraneous commenting, which would be hidden in the editor, possibly through use of another Unicode character to mark the comment as compiler-made.) This would be fairly easy to do, as the structures of the two languages are somewhat similar. This would also allow faster compilation into .class files, and simpler interaction with Java's class library. The downside, of course, is that you'd be working through Java on a VM, which could be fairly slow to rewrite and compile.

A .fppc file would be similar (albeit in C++), however the code would have to be deconstructed a bit more in order to work in the same way as a .fppj file. (I was thinking of using UNiTY's instruction scheme, which should aid the translation to C++.) After that, compilation would be faster than in Java. However, C++ is a more exacting language, and I see the probability for translation errors as very high for .fppc files, much higher than their Java counterparts.

Feedback

Pros

.fppj

  • Works on any OS
  • Less translation between languages
  • Easy to decompile
  • Easy to read in a text-editor

.fppc

Cons

.fppj

.fppc

  • Only works on one OS at a time
  • Ambiguous; are you trying to make a C++ source file out of it or an assembly file?
    • I was thinking a C++ source file. It would look kind of strangely coded, but I think it would be doable. As for the instruction scheme, I was just thinking of simplifying the code out into syntax like NEWC, MANC, etc., before translating to C++.
      • What about pointers? There is no point in using C++ without pointers, and FiM++ does not currently support them.

User

I propose using AVM2 as the virtual machine for FiM++.

Feedback

Pros

  • We don't need to write a virtual machine.
  • We don't need to invent a new file format.
  • It is hard to decompile the code back into the human-readable form.
  • The code will run everywhere because Tamarin is cross-platform.

Cons

  • We will have to write a FiM++ to ABC bytecode compiler. It is not going to be easy because it will have to be a real compiler, not a simple translator. You cannot just replace tokens with 1-symbol identifiers to get ABC bytecode.
  • Not invented here.