In this section, we learn how to use the components and simulation modules of the standard library and how to extend the standard library with new components.
This page closely follows the slides. Please consider these as notes for the presentation. It may be helpful to reference the slides while following these notes.
This tutorial relies heavily on code provided in materials/using-gem5/02-stdlib for exercises. Completed exercises can be found in materials/using-gem5/02-stdlib/complete.
Prior to the development of the gem5 stdlib, users needed to create large configuration files, often hundreds of lines of Python, for even a basic simulation.
This was problematic. Typically this resulted in communities of gem5 users passing around pre-built scripts and modifying/extending them endlessly. We observed that typically users only needed to change a very small part in an otherwise “ordinary” simulation and, therefore, scripts containing “ordinary” simulation setups were frequently utilized.
In response to this the stdlib was created. The stdlib aims to remove “boilerplate” gem5 Python configuration code. It provides an abstraction where users can specify simulations at a high level and extend via established APIs. It is analogous to standard libraries provided by modern programming languages: A suite of helpful code to aid users, provided and supported by the project.
The standard library aims to be modular. That is, components should be interchangeable with components of the same type; i.e., it should be possible to swap one type of memory system with another, or one type of processor with another. There are 4 high-level component types provided as part of the gem5 stdlib: boards, processors, memory systems, and cache hierarchies.
The stdlib enforces an object-oriented design philosophy.
For example, to create a processor you must inherit from the AbstractProcessor class and implement all the abstract methods.
All boards inherit from AbstractBoard and accept an AbstractProcessor, an AbstractMemorySystem, and an AbstractCacheHierarchy in their constructors.
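The pattern described above can be sketched in plain Python with the abc module. Every class name here (ToyAbstractProcessor, ToyDualCoreProcessor, ToyBoard) is a hypothetical stand-in invented for illustration; none of this is gem5 code, and the real abstract classes declare many more methods.

```python
from abc import ABC, abstractmethod


class ToyAbstractProcessor(ABC):
    """Stand-in for an abstract component type (hypothetical, not gem5 code)."""

    @abstractmethod
    def get_num_cores(self) -> int:
        """Every concrete processor must report how many cores it has."""


class ToyDualCoreProcessor(ToyAbstractProcessor):
    """A concrete component: it implements every abstract method."""

    def get_num_cores(self) -> int:
        return 2


class ToyBoard:
    """Accepts any component that implements the abstract interface."""

    def __init__(self, processor: ToyAbstractProcessor) -> None:
        if not isinstance(processor, ToyAbstractProcessor):
            raise TypeError("processor must be a ToyAbstractProcessor")
        self.processor = processor


board = ToyBoard(processor=ToyDualCoreProcessor())
print(board.processor.get_num_cores())

# The abstract class itself cannot be instantiated.
try:
    ToyAbstractProcessor()
except TypeError as err:
    print(f"TypeError: {err}")
```

Because the board only depends on the abstract interface, any other class that implements get_num_cores could be passed in without changing ToyBoard. This is the sense in which components of the same type are interchangeable.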
All the source code for the stdlib is found in the “src/python/gem5” directory in the gem5 repository.
As this is Python, the directory structure maps to the import string.
For example, to import the SimpleBoard
declared within “src/python/gem5/components/boards/simple_board.py”, a user would include the following in their configuration file:
from gem5.components.boards.simple_board import SimpleBoard
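The mapping from file path to import string can be sketched in plain Python. The helper below, path_to_module, is a hypothetical name introduced for illustration and is not part of gem5; it assumes the package root is “src/python”, as the example above implies.

```python
from pathlib import PurePosixPath


def path_to_module(source_path: str, package_root: str = "src/python") -> str:
    """Convert a path under the package root into a dotted module name."""
    relative = PurePosixPath(source_path).relative_to(package_root)
    # Drop the ".py" suffix and join the remaining path parts with dots.
    return ".".join(relative.with_suffix("").parts)


module = path_to_module("src/python/gem5/components/boards/simple_board.py")
print(f"from {module} import SimpleBoard")
# prints: from gem5.components.boards.simple_board import SimpleBoard
```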
To get started we create a very basic SE Mode simulation which will run a “Hello World” binary. This will use the stdlib.
The materials/using-gem5/02-stdlib/hello-world.py file has been provided as a starting point. It provides all the necessary imports. This file should be extended for this exercise. The completed example is found in materials/using-gem5/02-stdlib/complete/hello-world-complete.py.
Those following this tutorial should add the following lines:
# Obtain the components
cache_hierarchy = NoCache()
memory = SingleChannelDDR3_1600("1GiB")
processor = SimpleProcessor(cpu_type=CPUTypes.TIMING, num_cores=1)
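This declares the components we’ll be using for this simulation:
NoCache is a cache hierarchy with no caches (the processor connects directly to memory).
SingleChannelDDR3_1600 is a single-channel DDR3-1600 memory system, here with 1GiB of capacity.
SimpleProcessor is a processor of N cores, all of the same type.
Next we add the board:
board = SimpleBoard(
    clk_freq="3GHz",
    processor=processor,
    memory=memory,
    cache_hierarchy=cache_hierarchy,
)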
This adds all the components to the board and sets the frequency to “3GHz”. In gem5 the frequency is set on the board.
The SimpleBoard
is used to run basic SE-Mode simulations.
gem5-resources is a repository providing resources that are known to be compatible with gem5.
These resources are not necessary for compiling or running gem5 but may aid users in running simulations, e.g., disk images, kernels, applications, and cross-compilers.
Resources are held on gem5’s Google Cloud Bucket and the sources for these resources are found at: https://gem5.googlesource.com/public/gem5-resources/
The stdlib can automatically obtain and use these resources.
We hope to provide a better web interface for navigating the resources at some point. At present, however, the best approach is to browse the https://resources.gem5.org/resources.json file. Warning: This is a machine-readable format.
In the “resources.json” file you can see each resource. Each has a “name” which can be used to reference (and obtain) a given resource. E.g., “riscv-disk-img” is the ID/name for a simple RISCV disk image based on busybox.
The following will obtain the “riscv-disk-img” resource then print where it was obtained:
gem5-x86 materials/using-gem5/02-stdlib/obtaining-resources.py
Please consult the materials/using-gem5/02-stdlib/obtaining-resources.py file to see how this is done.
It is the Resource
class which does all the work.
from gem5.resources.resource import Resource
resource = Resource("riscv-disk-img")
print(f"the resource is available at {resource.get_local_path()}")
If the same command is run again, the resource will not be re-retrieved. gem5-resources caches what is required.
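The “fetch once, then reuse the local copy” behavior can be illustrated with a small, generic sketch. This is not how gem5 implements its resource cache; obtain and fake_fetch are hypothetical names, and the “download” is simulated so that the example runs offline.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def obtain(name: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> Path:
    """Return the local path for `name`, calling `fetch` only on a cache miss."""
    local_path = cache_dir / name
    if not local_path.exists():
        local_path.write_bytes(fetch(name))
    return local_path


fetch_calls: List[str] = []


def fake_fetch(name: str) -> bytes:
    """Hypothetical stand-in for a network download; records each call."""
    fetch_calls.append(name)
    return b"pretend disk image contents"


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    first = obtain("riscv-disk-img", cache_dir, fake_fetch)
    second = obtain("riscv-disk-img", cache_dir, fake_fetch)
    same_path = first == second

print(f"downloads performed: {len(fetch_calls)}")
```

The second call finds the file already present and returns the same path without fetching again.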
If a required resource is not provided by gem5-resources but is instead supplied by the gem5 user, the CustomResource class can be used.
from gem5.resources.resource import CustomResource
CustomResource("tests/test-progs/hello/bin/x86/linux/hello")
Going back to “materials/using-gem5/02-stdlib/hello-world.py” we add:
# Set the workload.
binary = Resource("x86-hello64-static")
board.set_se_binary_workload(binary)
The set_se_binary_workload
function is used to set a Syscall Emulation mode workload with a single binary.
The following should be appended:
# Setup the simulator and run the simulation.
simulator = Simulator(board=board)
simulator.run()
This declares the board that is to be simulated.
The run()
function begins the simulation.
Save the file then execute:
gem5-x86 materials/using-gem5/02-stdlib/hello-world.py
This will run the simulation and produce an output to the terminal giving information on the simulation.
To get stats users can look into “m5out/stats.txt”.
The “m5out/config.” files detail the system configuration. Various formats exist. “m5out/config.ini” is the most human-readable.
*In full-system simulation, the terminal output can also be found in the “m5out” directory.
gem5 is modular. In general, you can replace components with other components of the same type.
In this example we will add a real cache implementation to the design.
The materials/using-gem5/02-stdlib/hello-world.py file from the previous examples will be extended for this. A completed example can be found in materials/using-gem5/02-stdlib/complete/hello-world-with-cache-complete.py.
#from gem5.components.cachehierarchies.classic.no_cache import NoCache
from gem5.components.cachehierarchies.classic.private_l1_private_l2_cache_hierarchy import PrivateL1PrivateL2CacheHierarchy
This imports the PrivateL1PrivateL2CacheHierarchy.
# cache_hierarchy = NoCache()
cache_hierarchy = PrivateL1PrivateL2CacheHierarchy(
    l1d_size="32KiB",
    l1i_size="32KiB",
    l2_size="64KiB",
)
This sets up our new cache hierarchy. The “Hello World” simulation can then be rerun with:
gem5-x86 materials/using-gem5/02-stdlib/hello-world.py
Short exercise: Compare the “m5out/stats.txt” files produced by running the simulation with and without a cache hierarchy.
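One way to approach the comparison is with a small script. The sketch below assumes the usual stats.txt layout of one “name value # description” entry per line. The two excerpts are made up purely to exercise the parser (the stat names and numbers are assumptions, not results from a real run), and parse_stats and changed_stats are hypothetical helper names.

```python
from typing import Dict, Tuple


def parse_stats(text: str) -> Dict[str, float]:
    """Parse 'name value # description' lines, skipping anything else."""
    stats: Dict[str, float] = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # e.g. the "Begin/End Simulation Statistics" markers
    return stats


def changed_stats(
    a: Dict[str, float], b: Dict[str, float]
) -> Dict[str, Tuple[float, float]]:
    """Return the stats present in both runs whose values differ."""
    return {
        name: (a[name], b[name])
        for name in sorted(a.keys() & b.keys())
        if a[name] != b[name]
    }


# Made-up excerpts for illustration; real files hold many more entries.
RUN_A = """\
---------- Begin Simulation Statistics ----------
simSeconds        0.000400        # Number of seconds simulated (Second)
simInsts          5000            # Number of instructions simulated (Count)
---------- End Simulation Statistics   ----------
"""
RUN_B = """\
---------- Begin Simulation Statistics ----------
simSeconds        0.000100        # Number of seconds simulated (Second)
simInsts          5000            # Number of instructions simulated (Count)
---------- End Simulation Statistics   ----------
"""

for name, (a, b) in changed_stats(parse_stats(RUN_A), parse_stats(RUN_B)).items():
    print(f"{name}: {a} -> {b}")
```

To use this on real output, read each file with pathlib.Path(...).read_text(). Note that a second run writes a new “m5out/stats.txt”, so copy the first file elsewhere before rerunning.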
In this exercise an X86 full-system simulation will be created. As well as being a more advanced full-system setup, it introduces more advanced ideas of how to use the stdlib.
The materials/using-gem5/02-stdlib/x86-full-system.py file provides a starting point for this exercise. It is assumed those following along with this exercise will append to this file. The completed example for this exercise is found in materials/using-gem5/02-stdlib/complete/x86-full-system-complete.py.
To begin:
requires(
    isa_required=ISA.X86,
    coherence_protocol_required=CoherenceProtocol.MESI_TWO_LEVEL,
)
This adds a check on the gem5 binary that is parsing the script.
It is a safety check to ensure the gem5 binary you are running is suitable for the simulation.
While using the requires function is optional, it can stop the simulation from failing in a catastrophic manner.
In this case the requires function is checking that the gem5 binary was compiled with the X86 ISA and with the MESI_Two_Level coherence protocol.
Moving on we add the following:
cache_hierarchy = MESITwoLevelCacheHierarchy(
    l1d_size="32KiB",
    l1d_assoc=8,
    l1i_size="32KiB",
    l1i_assoc=8,
    l2_size="256KiB",  # l2_size is required; 256KiB is an assumed, illustrative value.
    l2_assoc=16,
    num_l2_banks=1,
)
memory = SingleChannelDDR3_1600("2GiB")
This adds a Ruby cache hierarchy: the MESITwoLevelCacheHierarchy.
The interfaces for Ruby caches and classic caches are the same at this level, so they are interchangeable.
There is nothing particularly special about the memory system.
It is a 2GiB single-channel DDR3-1600 setup.
processor = SimpleSwitchableProcessor(
    starting_core_type=CPUTypes.TIMING,
    switch_core_type=CPUTypes.O3,
    num_cores=2,
)
The SimpleSwitchableProcessor allows different types of cores to be swapped during a simulation with processor.switch().
This can be a useful tactic for switching between faster (but less accurate) CPU cores and slower (but more accurate) CPU cores. The faster CPU cores can be used to execute regions of code of lesser interest while the more detailed cores can be used on regions of code of greater interest (e.g., you can boot the OS on a faster CPU type and then execute the program of interest on the slower, more accurate CPU type).
We then add all these components to the board:
board = X86Board(
    clk_freq="3GHz",
    processor=processor,
    memory=memory,
    cache_hierarchy=cache_hierarchy,
)
In this case we use the X86Board, a board used to execute full-system simulations using the X86 ISA.
Next we add:
command = "m5 exit;" \
    + "echo 'This is running on O3 CPU cores.';" \
    + "sleep 1;" \
    + "m5 exit;"
board.set_kernel_disk_workload(
    kernel=Resource("x86-linux-kernel-5.4.49"),
    disk_image=Resource("x86-ubuntu-18.04-img"),
    readfile_contents=command,
)
The set_kernel_disk_workload
function is used to run a full system workload.
The kernel
and disk_image
resources must be specified as a resource.
In this case we are using the x86 Linux v5.4.49 resource and the Ubuntu 18.04 resource.
The readfile_contents
parameter utilizes gem5’s readfile
command (part of m5 ops
).
It is a way of passing a file’s contents to the gem5 simulation.
It can then be used by the simulation.
The “x86-ubuntu-18.04-img” disk image resource will copy the contents to a file and then execute it as a script, if set, once the boot is complete (if not set, the simulation simply exits after booting).
Therefore, in this example the value of readfile_contents
is a shell script to be executed:
An m5 exit
instruction, an echo statement, a sleep statement, then another m5 exit
instruction.
During a simulation you can have “Exit Events”.
In this example there are two, both triggered by the m5 exit
instructions.
The simulation loop (the loop in which the simulation executes) exits upon reaching an exit event and returns to the Python configuration script.
Once back in the configuration script, other actions can be taken and the simulation loop re-entered via a simulator.run() call.
When the simulation loop is re-entered, it resumes in the same state in which it left.
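The exit-and-resume behavior can be pictured with a toy model. ToySimulator below is a hypothetical class written for illustration only; it is not gem5’s Simulator, and the “ticks” are just integers. The point is that run() returns control to the caller at each exit event and later picks up from the preserved state.

```python
from typing import List


class ToySimulator:
    """Toy model of a simulation loop (not gem5's Simulator class)."""

    def __init__(self, exit_ticks: List[int]) -> None:
        self.tick = 0
        self._pending_exits = sorted(exit_ticks)

    def run(self) -> str:
        """Advance until the next exit event, then return to the caller."""
        if not self._pending_exits:
            return "nothing left to simulate"
        self.tick = self._pending_exits.pop(0)
        return f"exit event at tick {self.tick}"


sim = ToySimulator(exit_ticks=[100, 250])
print(sim.run())   # exit event at tick 100
print("back in the configuration script; tick is", sim.tick)
print(sim.run())   # exit event at tick 250
```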
Go to the materials/using-gem5/02-stdlib/simulator-use.py file.
This is a pretty normal looking script, very similar to the “Hello World” program except we are executing the “m5-exit-example” binary.
Navigating to materials/using-gem5/02-stdlib/m5-exit-example/m5-exit/example.c we can see the source.
In it an m5_exit(0) command is called indefinitely, while also printing how many times the program has exited thus far.
Going back to materials/using-gem5/02-stdlib/simulator-use.py we can append the following to the file:
simulator = Simulator(board=board)
simulator.run()
print("We can do stuff after an m5 exit event. Prior to continuing the simulation")
simulator.run()
print("And again...")
simulator.run()
This can then be run with:
gem5-x86 materials/using-gem5/02-stdlib/simulator-use.py
The purpose of this exercise is to show that it’s possible to exit the simulation loop and run things in the configuration script before re-entering the simulation loop.
We can, however, be more clever. Remove the code appended to the file and instead, add the following:
simulator = Simulator(
    board=board,
    on_exit_event={
        ExitEvent.EXIT: (
            print(statement)
            for statement in [
                "Just exited the first exit event",
                "Just exited the second exit event",
                "Just exited the third exit event",
            ]
        )
    },
)
simulator.run()
Again we execute with:
gem5-x86 materials/using-gem5/02-stdlib/simulator-use.py
Here we are making use of the Simulator
class’s on_exit_event
parameter.
This parameter is a Python Dict which maps exit event types to a generator which yields on each execution of that exit event.
In this example we are simply printing a statement for the first three times the exit event is reached.
Here we’re only covering the “Exit” type exit event but there are other types.
You can override the behavior for different exit event types. The Simulator module has default behavior for each (see gem5/src/python/gem5/simulate/exit_event_generators.py).
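It may help to see why a generator expression suits this parameter: the body of a generator expression is evaluated lazily, one item per next() call, so each handler runs only when its exit event actually occurs. The sketch below is plain Python and does not use gem5; handlers, on_exit, and log are hypothetical names, and the string key "exit" stands in for an exit event type.

```python
from typing import Dict, Iterator, List

log: List[str] = []

# One generator per event type; each next() call runs one handler body.
handlers: Dict[str, Iterator[None]] = {
    "exit": (log.append(message) for message in ["first", "second", "third"]),
}

# Building the dictionary has not run any handler body yet.
assert log == []


def on_exit(event: str) -> None:
    """Advance the generator registered for this event by one step."""
    next(handlers[event])


on_exit("exit")
print(log)  # ['first']
on_exit("exit")
print(log)  # ['first', 'second']
```

The same reasoning applies to a generator such as (func() for func in [processor.switch]): the switch function is called when the exit event is handled, not when the dictionary is built.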
Going back to “materials/using-gem5/02-stdlib/x86-full-system.py”, add the following:
simulator = Simulator(
    board=board,
    on_exit_event={
        # On the first exit event, switch from the TIMING cores to the O3 cores.
        ExitEvent.EXIT: (func() for func in [processor.switch]),
    },
)
simulator.run()
At this point the simulation setup is complete. It can be executed using:
gem5-x86 materials/using-gem5/02-stdlib/x86-full-system.py
Warning: This will take a long time to complete execution. Use Ctrl+C to exit.
In this exercise we’ll extend the gem5 stdlib to add a new component.
We’ll create a new classic cache hierarchy called UniqueCacheHierarchy.
In this exercise we’ll be modifying files in materials/using-gem5/02-stdlib/unique_cache_hierarchy/. A completed exercise example can be found in materials/using-gem5/02-stdlib/complete/unique_cache_hierarchy.
Starting with “materials/using-gem5/02-stdlib/unique_cache_hierarchy/unique_cache_hierarchy.py” we have extended the AbstractClassicCacheHierarchy
class and have exposed the abstract methods that must be implemented.
Going through these one by one:
def __init__(self) -> None:
    AbstractClassicCacheHierarchy.__init__(self=self)
    self.membus = SystemXBar(width=64)
Here we set up a memory bus for our simulations.
def get_mem_side_port(self) -> Port:
    return self.membus.mem_side_ports
Here we set the cache hierarchy memory-side ports to that of the membus.
def get_cpu_side_port(self) -> Port:
    return self.membus.cpu_side_ports
Here we set the cache hierarchy CPU-side ports to that of the membus.
Next we navigate to materials/using-gem5/02-stdlib/unique_cache_hierarchy/l1cache.py.
We require an implementation of the Cache
SimObject for the cache hierarchy component.
In this file the Cache SimObject has been extended but the SimObject’s member variables must be set, which we do as follows:
class L1Cache(Cache):
    """
    A simple cache with default values.
    """
    def __init__(self):
        super().__init__()
        self.size = "32KiB"
        self.assoc = 8
        self.tag_latency = 1
        self.data_latency = 1
        self.response_latency = 1
        self.mshrs = 16
        self.tgts_per_mshr = 20
        self.writeback_clean = True
        self.prefetcher = StridePrefetcher()
SimObject member variables are special.
You can only set variables that the SimObject declares.
Code like self.custom_variable = 7 will cause an error.
If you want to create a non-SimObject variable, the variable name must have a preceding underscore, e.g., self._custom_variable = 7.
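The attribute rule can be mimicked with a toy class. ToySimObject below is a hypothetical illustration of the rule as described here, not gem5’s SimObject implementation; the set of declared parameters is invented for the example.

```python
class ToySimObject:
    """Toy model of the attribute rule (not gem5's SimObject implementation)."""

    # Hypothetical set of declared parameters for this toy class.
    _declared_params = {"size", "assoc"}

    def __setattr__(self, name: str, value: object) -> None:
        if name.startswith("_") or name in self._declared_params:
            object.__setattr__(self, name, value)
        else:
            raise AttributeError(f"'{name}' is not a declared parameter")


cache = ToySimObject()
cache.size = "32KiB"          # declared parameter: accepted
cache._custom_variable = 7    # leading underscore: accepted

try:
    cache.custom_variable = 7  # neither declared nor underscored: rejected
except AttributeError as err:
    print(err)
```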
Going back to “materials/using-gem5/02-stdlib/unique_cache_hierarchy/unique_cache_hierarchy.py”, we will implement the incorporate_cache function:
def incorporate_cache(self, board: AbstractBoard) -> None:
    # Set up the system port for functional access from the simulator.
    board.connect_system_port(self.membus.cpu_side_ports)
    # Connect every memory controller to the memory bus.
    for cntr in board.get_memory().get_memory_controllers():
        cntr.port = self.membus.mem_side_ports
    # Create one L1 instruction cache and one L1 data cache per core.
    self.l1icaches = [
        L1Cache()
        for i in range(board.get_processor().get_num_cores())
    ]
    self.l1dcaches = [
        L1Cache()
        for i in range(board.get_processor().get_num_cores())
    ]
    for i, cpu in enumerate(board.get_processor().get_cores()):
        # Connect the core to its private L1 caches.
        cpu.connect_icache(self.l1icaches[i].cpu_side)
        cpu.connect_dcache(self.l1dcaches[i].cpu_side)
        # Connect the L1 caches to the memory bus.
        self.l1icaches[i].mem_side = self.membus.cpu_side_ports
        self.l1dcaches[i].mem_side = self.membus.cpu_side_ports
        # Connect the core's interrupt ports to the memory bus.
        int_req_port = self.membus.mem_side_ports
        int_resp_port = self.membus.cpu_side_ports
        cpu.connect_interrupt(int_req_port, int_resp_port)
From here, the custom cache hierarchy can be incorporated into a stdlib design by importing the class and passing an instance to the board as its cache_hierarchy.
As a challenge, create your own gem5 Private L1 Shared L2 Cache hierarchy.
Once done, modify the “hello-world.py” program to see the performance difference between using the Private L1 Cache/Private L2 Cache vs your custom Private L1/Shared L2 Cache hierarchy.