William Stallings, Computer Organization and Architecture, 6th Edition - Chapter 13: Reduced Instruction Set Computers


PPT, 38 pages | Shared by: nguyenlam99 | Date: 04/01/2019
Major Advances in Computers (1)
- The family concept
  - IBM System/360 1964
  - DEC PDP-8
  - Separates architecture from implementation
- Microprogrammed control unit
  - Idea by Wilkes 1951
  - Produced by IBM S/360 1964
- Cache memory
  - IBM S/360 model 85 1969

Major Advances in Computers (2)
- Solid State RAM
  - (See memory notes)
- Microprocessors
  - Intel 4004 1971
- Pipelining
  - Introduces parallelism into fetch execute cycle
- Multiple processors

The Next Step - RISC
- Reduced Instruction Set Computer
- Key features
  - Large number of general purpose registers
  - or use of compiler technology to optimize register use
  - Limited and simple instruction set
  - Emphasis on optimising the instruction pipeline

Comparison of processors
[Table: Comparison of processors]

Driving force for CISC
- Software costs far exceed hardware costs
- Increasingly complex high level languages
- Semantic gap
- Leads to:
  - Large instruction sets
  - More addressing modes
  - Hardware implementations of HLL statements
    - e.g. CASE (switch) on VAX

Intention of CISC
- Ease compiler writing
- Improve execution efficiency
  - Complex operations in microcode
- Support more complex HLLs

Execution Characteristics
- Operations performed
- Operands used
- Execution sequencing
- Studies have been done based on programs written in HLLs
- Dynamic studies are measured during the execution of the program

Operations
- Assignments
  - Movement of data
- Conditional statements (IF, LOOP)
  - Sequence control
- Procedure call-return is very time consuming
- Some HLL instructions lead to many machine code operations

Relative Dynamic Frequency (%)

| Statement | Dynamic occurrence, Pascal | Dynamic occurrence, C | Machine instruction (weighted), Pascal | Machine instruction (weighted), C | Memory reference (weighted), Pascal | Memory reference (weighted), C |
|---|---|---|---|---|---|---|
| Assign | 45 | 38 | 13 | 13 | 14 | 15 |
| Loop | 5 | 3 | 42 | 32 | 33 | 26 |
| Call | 15 | 12 | 31 | 33 | 44 | 45 |
| If | 29 | 43 | 11 | 21 | 7 | 13 |
| GoTo | - | 3 | - | - | - | - |
| Other | 6 | 1 | 3 | 1 | 2 | 1 |

Operands
- Mainly local scalar variables
- Optimisation should concentrate on accessing local variables

| Operand type (%) | Pascal | C | Average |
|---|---|---|---|
| Integer constant | 16 | 23 | 20 |
| Scalar variable | 58 | 53 | 55 |
| Array/structure | 26 | 24 | 25 |

Procedure Calls
- Very time consuming
- Depends on number of parameters passed
- Depends on level of nesting
- Most programs do not do a lot of calls followed by lots of returns
- Most variables are local
- (c.f. locality of reference)

Implications
- Best support is given by optimising most used and most time consuming features
- Large number of registers
  - Operand referencing
- Careful design of pipelines
  - Branch prediction etc.
- Simplified (reduced) instruction set

Large Register File
- Software solution
  - Require compiler to allocate registers
  - Allocate based on most used variables in a given time
  - Requires sophisticated program analysis
- Hardware solution
  - Have more registers
  - Thus more variables will be in registers

Registers for Local Variables
- Store local scalar variables in registers
- Reduces memory access
- Every procedure (function) call changes locality
- Parameters must be passed
- Results must be returned
- Variables from calling programs must be restored

Register Windows
- Only few parameters
- Limited range of depth of call
- Use multiple small sets of registers
- Calls switch to a different set of registers
- Returns switch back to a previously used set of registers

Register Windows cont.
- Three areas within a register set
  - Parameter registers
  - Local registers
  - Temporary registers
  - Temporary registers from one set overlap parameter registers from the next
  - This allows parameter passing without moving data

Overlapping Register Windows
[Figure: Overlapping Register Windows]

Circular Buffer diagram
[Figure: Circular Buffer diagram]

Operation of Circular Buffer
- When a call is made, a current window pointer is moved to show the currently active register window
- If all windows are in use, an interrupt is generated and the oldest window (the one furthest back in the call nesting) is saved to memory
- A saved window pointer indicates where the next saved window should be restored to

Global Variables
- Allocated by the compiler to memory
  - Inefficient for frequently accessed variables
- Have a set of registers for global variables

Registers v Cache

| Large Register File | Cache |
|---|---|
| All local scalars | Recently used local scalars |
| Individual variables | Blocks of memory |
| Compiler assigned global variables | Recently used global variables |
| Save/restore based on procedure nesting | Save/restore based on caching algorithm |
| Register addressing | Memory addressing |

Referencing a Scalar - Window Based Register File
[Figure: Referencing a Scalar - Window Based Register File]

Referencing a Scalar - Cache
[Figure: Referencing a Scalar - Cache]

Compiler Based Register Optimization
- Assume small number of registers (16-32)
- Optimizing use is up to compiler
- HLL programs have no explicit references to registers
  - usually - think about C - register int
- Assign symbolic or virtual register to each candidate variable
- Map (unlimited) symbolic registers to real registers
- Symbolic registers that do not overlap can share real registers
- If you run out of real registers some variables use memory

Graph Coloring
- Given a graph of nodes and edges
- Assign a color to each node
- Adjacent nodes have different colors
- Use minimum number of colors
- Nodes are symbolic registers
- Two registers that are live in the same program fragment are joined by an edge
- Try to color the graph with n colors, where n is the number of real registers
- Nodes that cannot be colored are placed in memory

Graph Coloring Approach
[Figure: Graph Coloring Approach]

Why CISC (1)?
- Compiler simplification?
  - Disputed
  - Complex machine instructions harder to exploit
  - Optimization more difficult
- Smaller programs?
  - Program takes up less memory but
  - Memory is now cheap
  - May not occupy fewer bits, just look shorter in symbolic form
    - More instructions require longer op-codes
    - Register references require fewer bits

Why CISC (2)?
- Faster programs?
  - Bias towards use of simpler instructions
  - More complex control unit
  - Microprogram control store larger
  - thus simple instructions take longer to execute
- It is far from clear that CISC is the appropriate solution

RISC Characteristics
- One instruction per cycle
- Register to register operations
- Few, simple addressing modes
- Few, simple instruction formats
- Hardwired design (no microcode)
- Fixed instruction format
- More compile time/effort

RISC v CISC
- Not clear cut
- Many designs borrow from both philosophies
- e.g. PowerPC and Pentium II

RISC Pipelining
- Most instructions are register to register
- Two phases of execution
  - I: Instruction fetch
  - E: Execute
    - ALU operation with register input and output
- For load and store
  - I: Instruction fetch
  - E: Execute
    - Calculate memory address
  - D: Memory
    - Register to memory or memory to register operation

Effects of Pipelining
[Figure: Effects of Pipelining]

Optimization of Pipelining
- Delayed branch
  - Does not take effect until after execution of following instruction
  - This following instruction is the delay slot

Normal and Delayed Branch

| Address | Normal | Delayed | Optimized |
|---|---|---|---|
| 100 | LOAD X,A | LOAD X,A | LOAD X,A |
| 101 | ADD 1,A | ADD 1,A | JUMP 105 |
| 102 | JUMP 105 | JUMP 106 | ADD 1,A |
| 103 | ADD A,B | NOOP | ADD A,B |
| 104 | SUB C,B | ADD A,B | SUB C,B |
| 105 | STORE A,Z | SUB C,B | STORE A,Z |
| 106 | | STORE A,Z | |

Use of Delayed Branch
[Figure: Use of Delayed Branch]

Controversy
- Quantitative
  - compare program sizes and execution speeds
- Qualitative
  - examine issues of high level language support and use of VLSI real estate
- Problems
  - No pair of RISC and CISC that are directly comparable
  - No definitive set of test programs
  - Difficult to separate hardware effects from compiler effects
  - Most comparisons done on "toy" rather than production machines
  - Most commercial devices are a mixture

Required Reading
- Stallings chapter 13
- Manufacturer web sites
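
The graph coloring idea from the slides (symbolic registers as nodes, an edge between two registers that are live at the same time, n colors for n real registers, uncolorable nodes go to memory) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the live ranges, the function names `build_interference_graph` and `color_registers`, and the simple greedy ordering by node degree. Production compilers use more refined heuristics; this is not the slides' algorithm verbatim.

```python
def build_interference_graph(live_ranges):
    """Build the interference graph from hypothetical live ranges.

    live_ranges maps a symbolic register name to an inclusive
    (start, end) interval of program points. Two symbolic registers
    are joined by an edge when their intervals overlap.
    """
    graph = {name: set() for name in live_ranges}
    names = list(live_ranges)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            a_start, a_end = live_ranges[a]
            b_start, b_end = live_ranges[b]
            if a_start <= b_end and b_start <= a_end:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def color_registers(graph, n_real):
    """Greedily color the graph with at most n_real colors.

    Returns (assignment, spilled): assignment maps a symbolic register
    to a real register number; spilled lists the symbolic registers
    that could not be colored and would be kept in memory.
    """
    assignment = {}
    spilled = []
    # Assumed heuristic: visit the most constrained nodes first.
    order = sorted(graph, key=lambda node: (-len(graph[node]), node))
    for node in order:
        used = {assignment[nb] for nb in graph[node] if nb in assignment}
        free = [reg for reg in range(n_real) if reg not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)
    return assignment, spilled


if __name__ == "__main__":
    # Hypothetical live ranges for five symbolic registers.
    ranges = {"A": (0, 3), "B": (1, 5), "C": (2, 4), "D": (4, 7), "E": (6, 8)}
    graph = build_interference_graph(ranges)
    for n_real in (3, 2):
        assignment, spilled = color_registers(graph, n_real)
        print(n_real, "real registers:", assignment, "spilled:", spilled)
```

Symbolic registers whose live ranges do not overlap (for example A and D above) are free to share one real register, which is the point made on the Compiler Based Register Optimization slide.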

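
The circular buffer of register windows described in the slides can also be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: the class name `RegisterWindowFile` and its counters are hypothetical, the overlap between adjacent windows is ignored, and every physical window is allowed to hold one procedure activation. It only tracks when a call forces the oldest window to be saved to memory and when a return forces a saved window to be restored.

```python
class RegisterWindowFile:
    """Toy model of a circular buffer of register windows."""

    def __init__(self, n_windows):
        self.n_windows = n_windows
        self.cwp = 0         # current window pointer (index in the buffer)
        self.depth = 0       # call nesting depth of the active procedure
        self.resident = 1    # activations currently held in registers
        self.saved = []      # nesting depths of windows saved to memory
        self.overflows = 0   # calls that forced a save to memory
        self.underflows = 0  # returns that forced a restore from memory

    def call(self):
        if self.resident == self.n_windows:
            # All windows in use: save the oldest activation to memory.
            self.saved.append(self.depth - self.resident + 1)
            self.overflows += 1
            self.resident -= 1
        self.cwp = (self.cwp + 1) % self.n_windows
        self.depth += 1
        self.resident += 1

    def ret(self):
        if self.resident == 1:
            # The caller's window is not in registers: restore it.
            if not self.saved:
                raise RuntimeError("return from the outermost procedure")
            self.saved.pop()
            self.underflows += 1
            self.resident += 1
        self.cwp = (self.cwp - 1) % self.n_windows
        self.depth -= 1
        self.resident -= 1


def run_trace(n_windows, trace):
    """Replay a string of 'c' (call) and 'r' (return) events."""
    windows = RegisterWindowFile(n_windows)
    for event in trace:
        if event == "c":
            windows.call()
        else:
            windows.ret()
    return windows.overflows, windows.underflows


if __name__ == "__main__":
    # Nesting depth stays within a small range: no memory traffic expected.
    print(run_trace(4, "ccrcrcrr"))
    # Five nested calls followed by five returns with only four windows.
    print(run_trace(4, "cccccrrrrr"))
```

The first trace reflects the observation in the slides that most programs do not make a long run of calls followed by a long run of returns, so a small number of windows covers most of the call activity without touching memory.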