Inf2A 2018–19: Assignment 1The Language Processing Pipeline for Micro-HaskellIssued 12 October 2018The deadline for this assignment is 4pm, Tuesday 30 October 2018 (the night beforeHallowe’en).OverviewThe objective of this practical is to illustrate the language processing pipeline in thecase of a small but interesting fragment of the Haskell programming language, which weshall call Micro-Haskell (or MH). The practical is self-contained, and prior knowledge ofHaskell itself is not assumed; however, those with no previous experience of Haskell mayneed to invest a little more time to understand what is required. The brief introductionto Micro-Haskell to be given in Lecture 14 may also be helpful.The practical illustrates all stages of the language processing pipeline for programminglanguages, taking us from a source file, written in MH, to execution of the program.In our case, the pipeline has four stages: lexing, parsing, typechecking and evaluation.Your task is to provide language-specific material to assist with the first three stages:lexing, covered in Part A of the practical; parsing, covered in Part B; and typechecking,covered in Part C. A simple evaluator is provided for you.The code implementing all four stages will itself be written in Java. (So you arewarned at the outset that you’ll need to think in two languages at once!) Severalgeneral-purpose components are provided; your task is to supply the parts specific toMH, by applying your understanding of the course material. Once you have finished,you’ll be able to hook up your implementations to the given evaluator to obtain acomplete working implementation of MH.Of course, many libraries of lexing and parsing tools in Java are available on theInternet, but the point of this practical is to build things up from first principles inorder to understand how they work. You should therefore not attempt to use anylexer or parser utilities you may find online, nor any tools such as StringTokenizer orStreamTokenizer that might otherwise appear tempting.1Besides writing your own code, you are advised to spend some time understandingwhat the provided parts of the code are doing, at least in broad terms. This will helpyou to understand what you need to implement, and also to see how the various stagesof the pipeline all fit together.The version of Java for this practical is 1.8.0 181 (the version currently installed onthe DICE machines). If you have another version installed on a machine of your own,you may still be able to complete the practical with this, but it’s your responsibilityto check that the final version of your code compiles and runs under the above versionbefore you submit it.InstructionsTo begin, download the code file Inf2A_Prac1_Files.zip from the Informatics 2AAssignments webpage. On a DICE machine, this can be unpacked usingunzip Inf2A_Prac1_Files.zipThis will build a subdirectory Inf2A_Prac1_Files inside your working directory (folder),within which you will find all source files needed for the assignment. Look first at thefile MH_example.txt, which provides a small sample of Micro-Haskell.The assignment comprises four parts: A–D. In each of Parts A–C, you have tocomplete a template Java file that is provided for you. In doing this, fill in only theparts of code that you are instructed to write. Do not make any other alterationsto the existing code — this will invalidate your solution! You are required tosubmit your solutions from a DICE machine, using the submit command, as specifiedbelow.Part Submit command MarksA submit inf2a cw1 MH_Lexer.java 35B submit inf2a cw1 MH_Parser.java 30C submit inf2a cw1 MH_Typechecker.java 25D no further submission required 10It is also possible to submit your solutions to parts A–C simultaneously:submit inf2a cw1 MH_Lexer.java MH_Parser.java MH_Typechecker.javaPart D combines Parts A–C with the evaluator to create a full working implementationof MH; it does not require any submission of its own.2Part A: Lexing in Micro-Haskell [35 marks]In this part, we construct a lexical analyser for MH. A general-purpose longest-matchlexer is already provided. Your task is to supply deterministic finite-state machines thatserve as recognizers for the various lexical classes in the language.Look at the provided file GenLexer.java. It begins with some Java functions thatdefine certain useful classes of characters: letter ,small, large, digit,symbolic, whitespace,newline. Next comes a Java interface DFA which defines the functionality that any fi-nite state machine has to provide. Some of this is provided in the class Acceptor whichfollows, but notice that this class contains stubs for five ‘abstract’ methods whose implementationwill be specific to the particular DFA in question. There then follow three examplesof how to construct implementations of particular DFAs: EvenLetterAcceptor,AndAcceptor and SpaceAcceptor. Later in the file, the class DemoLexer illustrates howthese DFAs may be combined to yield a lexer for a simple (and silly) language, and theclass LexerDemo gives you a simple way of trying this out (the comments in the sourcefile explain how).Notice that states are represented by integers, with 0 as the initial state. Besidesthe transition operation and the set of accepting states, our DFAs here must also beequipped with a designated dead (or “garbage”) state: that is, a non-accepting statefrom which no sequence of transitions can lead to an accepting state. Note also that ourDFAs must include the method String lexClass(), which provides the name of thelexical class they are associated with. This is done because we wish our lexer to outputa stream of tokens each tagged with their lexical class.Your objective in Part A is to implement DFAs in the same style corresponding tothe lexical classes of Micro-Haskell. This is to be done in the file MH_Lexer.java, whichcurrently provides a template containing some gaps for you to fill in. For the first six ofthese gaps, follow the pattern of the examples in GenLexer.java to construct DFAs forthe following lexical classes defined by regular expressions. (These correspond closely tolexical classes of actual Haskell, except that we’ve chosen a slightly different definitionof the class NUM.) A class VAR of variables, defined bysmall (small + large + digit+) A class NUM of numeric literals, defined by0 + nonZeroDigit digit ?where nonZeroDigit means what you think it does.3 A class BOOLEAN of boolean literals, defined byTrue + False A class SYM of symbolic tokens, defined bysymbolic symbolic? A class of whitespace elements, defined bywhitespace whitespace? A class of comments, defined by- - -(nonSymbolNewline nonNewline? + ε)where nonSymbolNewline is the set of all characters except those of symbolic ornewline, and nonNewline is the set of all characters except those of newline. Notethat - - -effectively means ‘two or more dashes’.The names of the last two classes, implemented by the lexClass() method, should bothbe the empty string. This will notify the lexer that tokens of these classes should bediscarded.In addition to these classes, keywords such as if and special symbols such as ; willrequire ‘singleton’ (i.e. one-element) lexical classes all to themselves. For this purpose,we will provide a class tokAcceptor which, for any string tok that we supply, cancreate a special DFA that accepts the string tok and no other strings. For instance, theconstructor call tokAcceptor(if) should create a DFA that accepts only the stringif. Fill in the gap in the implementation of this class so as to achieve this behaviour.(Note that this is quite different from the other classes above, in that we will be able tocreate several objects of class tokAcceptor each implementing a different DFA.) Herethe name of the lexical class should be identical to the string itself — this will serve tomake the specific token we are dealing with visible to the parser.The lexical classes we require for MH are now the six lexical classes listed above,together with singleton classes for the five keywords Integer, Bool, if, then, else andfor the three special symbols (, ), ;.Following the example of class DemoLexer in the file GenLexer.java, add a fewlines of code to construct acceptors for these fourteen classes, and put these togetherin an array called MH_acceptors. The acceptors should be listed in order of priority,4with highest priority first, which should be sensibly chosen so that keywords like if areassigned an appropriate lexical class (so as to emulate the behaviour of an actual Haskellimplementation).The MH_acceptors array is fed to a general-purpose routine that performs longestmatchlexing (also known as maximal munch) using the method described in Lecture7. Take a brief look at the code for this in GenLexer.java, and check that you broadlyunderstand what it is doing.You should now be able to compile GenLexer.java and your file MH_Lexer.java tocreate a lexer for MH. To test your lexer, you might wish to adapt the LexerDemo codein GenLexer.java; this will allow you to create a simple command-line driven lexicalanalyser for MH. You are not required to submit this test code, however.Before leaving the topic of lexing, take a quick glance at the code provided inCheckedSymbolLexer.java. This performs some mild post-processing on the streamof lexical tokens: symbolic tokens are checked to ensure that they are among the tokensthat feature in MH::: -> = == If they are, then the lexical classname SYM is replaced with the token itself, just as forkeywords and (, ), ;.Submission: Submit your answer to part A, from a DICE machine, using thecommand: submit inf2a cw1 MH_Lexer.javaPart B: An LL(1) parser for Micro-Haskell [30 marks]Take a look at the provided file GenParser.java. This begins with an interface TREEand a class STree for representing syntax trees (for any context-free grammar). Theclass GenParser then provides an implementation of the general LL(1) parsing algorithmas described in lectures (again, check that you broadly understand it).In order to complete this and obtain a working parser, some grammar-specific ingredientsmust be provided: a parse table and a choice of start symbol. The classEvenAndParser gives a simple example of how to do this, for an artificial language thatuses the lexical classes defined in GenLexer.java. Note in particular the convention thatthe names of nonterminals are identified by adding the symbol # (we can get away withthis because # doesn’t itself feature in any lexical tokens of MH). You can try out thisparser on the sample input file EvenAnd_example.txt, by compiling GenParser.javaand then typing5Prog → � | Decl ProgDecl → TypeDecl TermDeclTypeDecl → VAR :: Type ;Type → Type0 TypeRestTypeRest → � | -> TypeType0 → Integer | Bool | ( Type )TermDecl → VAR Args = Exp ;Args → � | VAR ArgsExp → Exp0 | if Exp then Exp else ExpExp0 → Exp1 Rest0Rest0 → � | == Exp1 | Exp1 → Exp2 Rest1Rest1 → � | + Exp2 Rest1 | - Exp2 Rest1Exp2 → Exp3 Rest2Rest2 → � | Exp3 Rest2Exp3 → VAR | NUM | BOOLEAN | ( Exp )Figure 1: Grammar for Micro-Haskelljava ParserDemo EvenAnd_example.txtYour task is to implement a similar working parser for the language MH, in the fileMH_Parser.java (which is discussed below), following the pattern of EvenAndParser.Now we consider the grammar of MH itself. The terminal symbols are the names oflexical classes in tokens output by CheckedSymbolLexer. The complete list of these isas follows:VAR NUM BOOLEAN Integer Bool if then else( ) ; :: -> = == The start symbol of the grammar is Prog, and the productions are as given in Figure 1.(To reduce clutter, we omit the # symbol here, instead distinguishing nonterminals bychoice of font.)If this grammar looks daunting at first, the following observations may be helpful:6 The grammar for types (i.e. the rules for Type, TypeRest, Type0 ) is a self-containedsub-grammar that can be understood in isolation; see Lecture 14. The grammar for expressions (the rules for all nonterminals from Exp onwards) isanother self-contained sub-grammar, and is broadly similar in its workings to theLL(1) grammar for arithmetic expressions from Lecture 13. Note that the productionsfor Exp2 and Rest2 are intended to cater for 代写Inf2A留学生作业、代做Micro-Haskell作业、代做Java课程设计作业、代写Java程序作业 代做数据库multiple function applications,such as f x y. It may be helpful to study the sample program in MH_example.txt in conjunctionwith the grammar rules.Once you feel you have assimilated the grammar, find yourself a large sheet of paperand work out the complete LL(1) parse table. (Most of the entries will be blank, sodon’t panic!) You may find that some calculations of First and Follow sets (as presentedin Lecture 12) help you to do this; however, you will not be required to submit thesecalculations or the written-out parse table you construct.Now open the file MH_Parser.java. You will see that the right hand sides of all thegrammar rules have already been constructed for your convenience, so all you have to dois to supply an implementation of the parse table itself in the style of EvenAndParser.You may make use of auxiliary definitions and other reasonable devices to reduce theamount of code you need to write, provided that your code remains clearly readable andits correspondence to the parse table you have drawn up remains transparent.After completing and compiling this, you will now be able to try out your parser onthe sample source file provided:java MH_ParserDemo MH_example.txtIf this reports successful parsing, it’s an encouraging sign that your parser is largelycorrect and will obtain a reasonable mark. However, to ensure it is completely correct,you will have to do some further testing of your own, since (a) there are possible parsingscenarios not represented by this small example, and (b) you will also need to ensurethat your parser rejects incorrect programs and that the error reports it produces arereasonable.Submission: Submit your answer to part B, from a DICE machine, using thecommand: submit inf2a cw1 MH_Parser.java7Part C: Typechecking for Micro-Haskell [25 marks]In this section, you will implement critical parts of a typechecker for MH.The LL(1) grammar we have been using serves to disambiguate inputs and makethem readily parseable; but once these issues have been got out of the way, it’s muchmore convenient to work with simpler trees known as abstract syntax trees (ASTs) inwhich extraneous detail has been stripped away. For example, as in Lecture 14, typesin MH are conceptually just trees for the grammar:Type → Integer | Bool | Type -> TypeLook at the file MH_Type_Impl.java, which defines a Java representation of MH typesin this stripped-down form. The interface MH_TYPE declares various operations one canperform on such types (check that you understand what they are intended to do), whilefurther down, the class MH_Type_Impl provides predefined constants for the MH typesInteger and Bool, as well as a constructor for building a function type (i.e. an arrowtype) from two previously existing MH types. In the typechecking code you will bewriting, these may be utilized as follows:MH_Type_Impl.IntegerType ; // AST for IntegerMH_Type_Impl.BoolType; // AST for Boolnew MH_Type_Impl (t1,t2); // AST for (t1->t2)Clearly, we will need a way to convert syntax trees as produced by the parserinto ASTs of this kind. This is done by the provided methods convertType andconvertType1 in MH_Type_Impl.java. A good warm-up to your own task would beto try and understand the workings of convertType and convertType1 with the helpof the comments provided.A similar notion of abstract syntax trees is also required for expressions (trees withtopmost label #Exp). In effect, ASTs for expressions are just trees for the simplifiedgrammar:Exp → VAR | NUM | BOOLEAN | Exp Exp | Exp infix Exp | if Exp then Exp else Expwhere infix ranges over ==, MH_EXP, which declares various operations that can be performed on such trees.The intended meanings of these operations are all you need to understand from thisfile (and you can ignore isLAMBDA and isREF). Their workings are further explainedby commented examples in the file MH_Exp_Impl.java immediately below the MH_EXPinterface. However, you don’t need to get to grips with the class MH_Exp_Impl, which8contains (among other things) some code for converting trees returned by the parserinto ASTs for expressions.Assuming the conversions to ASTs have already been done, your task is to write atypechecker for ASTs, by completing the body of the method computeType in the fileMH_Typechecker.java. More precisely, your code should compute the MH type of anexpression given as an AST exp of Java type MH_EXP, returning the result as an AST ofJava type MH_TYPE. If the expression is not correctly typed, your code should flag up atype error, which you can do by means of the command:throw new TypeError (blah blah) ;Each time you include such a command, the string you supply should provide a briefdescription of the nature of the type error in question. Such error messages shouldbe helpful to MH programmers who need to identify and correct type errors in theirprograms.There is one other important ingredient to be explained. The type of an expressionsuch as if x then y else z, or even whether it is well-typed at all, will depend onthe types ascribed to the variables x, y, z. In general, then, an expression exp willhave to be typechecked relative to a type environment which maps certain variables tocertain types associated with them. This is the purpose of env, the second argument tocomputeType. You may access the type associated with the variable x, for instance, bycalling env.typeOf(x).The definition of the type of an expression (if it has one) is given compositionally:that is, it is computed from the types of its subexpressions. This will be reflected in therecursive nature of your implementation of computeType: it should compute the type ofany subexpressions in order to obtain the type of the whole expression. Here are somehints on how this should work:1. The types of NUMs and BOOLEANs are what you think they are.2. You should assume that each of the infix operations accepts only integer arguments;however, the type of the resulting expression will depend on the infixoperation in question.3. In an application expression e1 e2, the type of e2 should match the argument typeexpected by e1, and you should think about what the type of the whole expressionwill be.4. An expression if e1 then e2 else e3 may in principle have any type; however, youshould consider what the types of e1, e2, e3 respectively will need to be in order forthe whole expression to have type t.9A final hint on the Java side. To test whether two given type ASTs are equal, youshould use the equals method from the interface MH_TYPE, not the == operator.As a rough guideline, the model implementation of computeType consists of about40 lines of Java code.When you have finished, compile the files MH_Type_Impl.java, MH_Exp_Impl.javaand MH_Typechecker.java (in that order), and try executingjava MH_Typechecker MH_example.txtOnce your typechecker works, this will report that the parse, type conversion and typecheckhave all been successful. To see what it is doing, look again at MH_example.txt.The system is using your code to check that the right hand side of each function defi-nition has the expected type relative to an environment that can be inferred from thetypes specified in the MH code. (All this is managed by the remaining code in the fileMH_Typechecker.java.) You should also try out your typechecker on other MH programs— including some that contain type errors to check that your code catches themcorrectly and supplies appropriate error messages.Submission: Submit your answer to part C, from a DICE machine, using thecommand: submit inf2a cw1 MH_Typechecker.javaPart D: Execution of Micro-Haskell [10 marks]This part of the practical puts all the pieces together to obtain a full working implementationof Micro-Haskell.The file MH_Evaluator.java contains a rudimentary evaluator for Micro-Haskell.It uses the other parts of the practical to lex, parse and type-check a program, afterwhich, the resulting abstract syntax tree for the program is executed using a small-stepoperational semantics (as will be covered in Lecture 28). This results in an implementationthat’s pretty slow compared with a real-world implementation of Haskell,1 butits purpose is to illustrate how the basic principles of language processing feed into theconstruction of a compiler/interpreter/evaluator. You are encouraged to take a brieflook at the evaluator source code to get a rough idea of how it works (this may becomeclearer after Lecture 28). You will notice that the code here is relatively short: the bulkof the work has already been done at earlier stages of the language processing pipeline.1If you’re interested in how to produce a more efficient implementation, take the UG3 course onCompiling Techniques next year.10To use the evaluator, once you have completed the rest of the practical, compileMH_Evaluator.java and then run it on the source file of your choice, e.g.:java MH_Evaluator MH_example.txtThis will load and typecheck the MH program, and display a prompt MH>. Type in anexpression you would like to evaluate, and hit return. The expression may involve thefunctions declared in your MH program. Do this as many times as you like; e.g.:MH> 3+6...MH> fib 7...MH> gcd 104 (fib 12)...To quit, hit CTRL-c.This last part of the assignment does not require you to do any further coding. Yoursolutions to Parts A–C will be combined with the evaluator, and the resulting completeimplementation of MH will be marked for the correctness of its behaviour on a testsuite of five MH programs (different from those in MH_example.txt). However, in orderto gain assurance that your solutions to A–C will indeed combine to yield a correctimplementation of Micro-Haskell, you should spend a little time testing your completesystem on some small MH programs of your own devising (there is a certain amount offun to be had from this in any case). You should also be sure to test your system on asample of incorrect programs, to ensure that an appropriate error message is raised bythe lexer, parser or typechecker as appropriate.Support for this assignmentThe optional lab sessions in Weeks 5 and 6 will offer support for this assignment: labdemonstrators will be on hand to help you, particularly with any Java programmingissues.You may also post queries to the Piazza forum, but we would ask you to do so rathersparingly: in some previous years, the volume of queries has been overwhelming for staffand students alike! Please respect the following guidelines:1. Do not post a query to the forum unless you have already spent a significant time(say 40 minutes) trying to answer it yourself — by studying this handout or thelecture slides, by searching the Web for relevant help, etc.112. Take care not to post any part of an actual solution — not even one line of code!3. Feel free to answer queries from other students (whilst observing 2). This is muchappreciated by staff and will often mean the question gets answered more promptly.4. Don’t post a query without first checking whether the same query has alreadybeen posted and answered.25. Do keep your postings polite and respectful.3Note on markingYour submission will be marked by a human marker with the assistance of some autotestingprograms that will run your code on various test examples. If your code passesall or most of these tests, the human marker’s job will be easy; but if not, the markermay need to inspect your code to assess your level of understanding. It is therefore inyour interests to include at least a few comments in your code to show that you knowwhat you are doing (and commenting your code is good practice in any case).Most critically, please, please ensure that your code compiles correctly (in conjunctionwith the other source files provided) before you submit it! Very few marks will beavailable for submissions that do not compile.Mary Cryan, October 20182It’s in everyone’s interests to keep the forum traffic manageable: in previous years, the number ofpostings was so daunting as to discourage students from scanning them all — they would then postduplicate queries and the problem would snowball.3Even though you are posting anonymously to your classmates, Shay and I can request informationabout posters if the facility is abused!转自:http://ass.3daixie.com/2018110427692671.html
讲解:Inf2A、Micro-Haskell、Java、JavaSQL|Database
©著作权归作者所有,转载或内容合作请联系作者
- 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
- 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
- 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
推荐阅读更多精彩内容
- By clicking to agree to this Schedule 2, which is hereby ...