Intel® Microarchitecture Code Named GoldmontPlus Events
This section provides reference for hardware events that can be monitored for the CPU(s):
  • Events for Intel® microarchitecture code name GoldmontPlus
  • CORE
    Event Name Description Additional Info EventType
    INST_RETIRED.ANY Counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers. This event uses fixed counter 0. IA32_FIXED_CTR0
    PEBS:[PreciseEventingIP]
    Architectural, Fixed
    CoreOnly
    CPU_CLK_UNHALTED.CORE Counts the number of core cycles while the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. In mobile systems the core frequency may change from time to time. For this reason this event may have a changing ratio with regards to time. This event uses fixed counter 1. IA32_FIXED_CTR1
    PEBS:[NonPreciseEventingIP]
    Architectural, Fixed
    CoreOnly
    CPU_CLK_UNHALTED.REF_TSC Counts the number of reference cycles that the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. In mobile systems the core frequency may change from time to time. This event is not affected by core frequency changes but counts as if the core is running at the maximum frequency all the time. This event uses fixed counter 2. IA32_FIXED_CTR2
    PEBS:[NonPreciseEventingIP]
    Architectural, Fixed
    CoreOnly
    BR_INST_RETIRED.ALL_BRANCHES Counts branch instructions retired for all branch types. This is an architectural performance event. EventSel=C4H UMask=00H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    Architectural
    CoreOnly
    BR_MISP_RETIRED.ALL_BRANCHES Counts mispredicted branch instructions retired including all branch types. EventSel=C5H UMask=00H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    Architectural
    CoreOnly
    CPU_CLK_UNHALTED.CORE_P Core cycles when core is not halted. This event uses a (_P)rogrammable general purpose performance counter. EventSel=3CH UMask=00H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    Architectural
    CoreOnly
    CPU_CLK_UNHALTED.REF Reference cycles when core is not halted. This event uses a (_P)rogrammable general purpose performance counter. EventSel=3CH UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    Architectural
    CoreOnly
    INST_RETIRED.ANY_P Counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The event continues counting during hardware interrupts, traps, and inside interrupt handlers. This is an architectural performance event. This event uses a (_P)rogrammable general purpose performance counter. *This event is Precise Event capable: The EventingRIP field in the PEBS record is precise to the address of the instruction which caused the event. EventSel=C0H UMask=00H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    Architectural
    CoreOnly
    LONGEST_LAT_CACHE.MISS Counts memory requests originating from the core that miss in the L2 cache. EventSel=2EH UMask=41H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    Architectural
    CoreOnly
    LONGEST_LAT_CACHE.REFERENCE Counts memory requests originating from the core that reference a cache line in the L2 cache. EventSel=2EH UMask=4FH
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    Architectural
    CoreOnly
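The EventSel and UMask values in the Additional Info column map onto the two low bytes of a programmable counter's event-select register: EventSel occupies bits 0-7 and UMask bits 8-15. The sketch below packs the two fields and also formats the result in the raw-event form accepted by Linux perf (`r` followed by the hex value). The function names are illustrative only, and the remaining control bits (USR, OS, EN, and so on) are deliberately left out.

```python
def perfevtsel_raw(event_sel: int, umask: int) -> int:
    """Pack EventSel (bits 0-7) and UMask (bits 8-15) into one raw value."""
    if not 0 <= event_sel <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("EventSel and UMask must each fit in 8 bits")
    return (umask << 8) | event_sel


def perf_raw_event(event_sel: int, umask: int) -> str:
    """Format the packed value as a Linux perf raw event string."""
    return f"r{perfevtsel_raw(event_sel, umask):04x}"


# EventSel/UMask pairs copied from the table above.
print(perf_raw_event(0xC4, 0x00))  # BR_INST_RETIRED.ALL_BRANCHES -> r00c4
print(perf_raw_event(0x2E, 0x41))  # LONGEST_LAT_CACHE.MISS       -> r412e
```

A string such as `r412e` could then be passed to `perf stat -e` on a GoldmontPlus system; the fixed-counter events (IA32_FIXED_CTR0 to IA32_FIXED_CTR2) have no EventSel/UMask pair and are not covered by this encoding.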
    BACLEARS.ALL Counts the number of times a BACLEAR is signaled for any reason, including, but not limited to indirect branch/call, Jcc (Jump on Conditional Code/Jump if Condition is Met) branch, unconditional branch/call, and returns. EventSel=E6H UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    BACLEARS.COND Counts BACLEARS on Jcc (Jump on Conditional Code/Jump if Condition is Met) branches. EventSel=E6H UMask=10H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    BACLEARS.RETURN Counts BACLEARS on return instructions. EventSel=E6H UMask=08H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    BR_INST_RETIRED.ALL_TAKEN_BRANCHES Counts the number of taken branch instructions retired. EventSel=C4H UMask=80H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_INST_RETIRED.CALL Counts near CALL branch instructions retired. EventSel=C4H UMask=F9H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_INST_RETIRED.FAR_BRANCH Counts far branch instructions retired. This includes far jump, far call and return, and Interrupt call and return. EventSel=C4H UMask=BFH
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_INST_RETIRED.IND_CALL Counts near indirect CALL branch instructions retired. EventSel=C4H UMask=FBH
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_INST_RETIRED.JCC Counts Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired, including both when the branch was taken and when it was not taken. EventSel=C4H UMask=7EH
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_INST_RETIRED.NON_RETURN_IND Counts near indirect call or near indirect jmp branch instructions retired. EventSel=C4H UMask=EBH
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_INST_RETIRED.REL_CALL Counts near relative CALL branch instructions retired. EventSel=C4H UMask=FDH
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_INST_RETIRED.RETURN Counts near return branch instructions retired. EventSel=C4H UMask=F7H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_INST_RETIRED.TAKEN_JCC Counts Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired that were taken and does not count when the Jcc branch instruction was not taken. EventSel=C4H UMask=FEH
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_MISP_RETIRED.IND_CALL Counts mispredicted near indirect CALL branch instructions retired, where the target address taken was not what the processor predicted. EventSel=C5H UMask=FBH
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_MISP_RETIRED.JCC Counts mispredicted Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired, including both when the branch was supposed to be taken and when it was not supposed to be taken (but the processor predicted the opposite condition). EventSel=C5H UMask=7EH
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_MISP_RETIRED.NON_RETURN_IND Counts mispredicted branch instructions retired that were near indirect call or near indirect jmp, where the target address taken was not what the processor predicted. EventSel=C5H UMask=EBH
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_MISP_RETIRED.RETURN Counts mispredicted near RET branch instructions retired, where the return address taken was not what the processor predicted. EventSel=C5H UMask=F7H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    BR_MISP_RETIRED.TAKEN_JCC Counts mispredicted Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired that were supposed to be taken but the processor predicted that they would not be taken. EventSel=C5H UMask=FEH
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    CORE_REJECT_L2Q.ALL Counts the number of demand and L1 prefetcher requests rejected by the L2Q due to a full or nearly full condition which likely indicates back pressure from L2Q. It also counts requests that would have gone directly to the XQ, but are rejected due to a full or nearly full condition, indicating back pressure from the IDI link. The L2Q may also reject transactions from a core to ensure fairness between cores, or to delay a core's dirty eviction when the address conflicts with incoming external snoops. EventSel=31H UMask=00H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    CPU_CLK_UNHALTED.THREAD Counts the number of core cycles while the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. In mobile systems the core frequency may change from time to time. For this reason this event may have a changing ratio with regards to time. This event uses fixed counter 1. IA32_FIXED_CTR1
    PEBS:[NonPreciseEventingIP]
    Fixed
    CoreOnly
    CYCLES_DIV_BUSY.ALL Counts core cycles if either divide unit is busy. EventSel=CDH UMask=00H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    CYCLES_DIV_BUSY.FPDIV Counts core cycles the floating point divide unit is busy. EventSel=CDH UMask=02H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    CYCLES_DIV_BUSY.IDIV Counts core cycles the integer divide unit is busy. EventSel=CDH UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    DECODE_RESTRICTION.PREDECODE_WRONG Counts the number of times the prediction (from the predecode cache) for instruction length is incorrect. EventSel=E9H UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    DL1.REPLACEMENT Counts when a modified (dirty) cache line is evicted from the data L1 cache and needs to be written back to memory. No count will occur if the evicted line is clean, and hence does not require a writeback. EventSel=51H UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    DTLB_LOAD_MISSES.WALK_COMPLETED_1GB Counts page walks completed due to demand data loads (including SW prefetches) whose address translations missed in all TLB levels and were mapped to 1GB pages. The page walks can end with or without a page fault. EventSel=08H UMask=08H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M Counts page walks completed due to demand data loads (including SW prefetches) whose address translations missed in all TLB levels and were mapped to 2M or 4M pages. The page walks can end with or without a page fault. EventSel=08H UMask=04H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    DTLB_LOAD_MISSES.WALK_COMPLETED_4K Counts page walks completed due to demand data loads (including SW prefetches) whose address translations missed in all TLB levels and were mapped to 4K pages. The page walks can end with or without a page fault. EventSel=08H UMask=02H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    DTLB_LOAD_MISSES.WALK_PENDING Counts once per cycle for each page walk occurring due to a load (demand data loads or SW prefetches). Includes cycles spent traversing the Extended Page Table (EPT). Average cycles per walk can be calculated by dividing by the number of walks. EventSel=08H UMask=10H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
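The WALK_PENDING description says the average cycles per walk is the pending count divided by the number of walks. A minimal sketch of that arithmetic follows. It assumes that "the number of walks" is the sum of the three WALK_COMPLETED_* counts, which the table does not state explicitly; the function name and the sample readings are made up for illustration.

```python
def avg_cycles_per_walk(walk_pending: int, completed_4k: int,
                        completed_2m_4m: int, completed_1gb: int) -> float:
    """Average page-walk duration in cycles.

    Assumption: total walks = sum of the WALK_COMPLETED_* counts.
    """
    walks = completed_4k + completed_2m_4m + completed_1gb
    if walks == 0:
        return 0.0  # no completed walks, so no meaningful average
    return walk_pending / walks


# Hypothetical counter readings for the four DTLB_LOAD_MISSES events.
print(avg_cycles_per_walk(walk_pending=12_000, completed_4k=300,
                          completed_2m_4m=90, completed_1gb=10))  # 30.0
```

The same helper applies unchanged to any group of walk events in this table that provides a WALK_PENDING count alongside WALK_COMPLETED_* counts.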
    DTLB_STORE_MISSES.WALK_COMPLETED_1GB Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 1GB pages. The page walks can end with or without a page fault. EventSel=49H UMask=08H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 2M or 4M pages. The page walks can end with or without a page fault. EventSel=49H UMask=04H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    DTLB_STORE_MISSES.WALK_COMPLETED_4K Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 4K pages. The page walks can end with or without a page fault. EventSel=49H UMask=02H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    DTLB_STORE_MISSES.WALK_PENDING Counts once per cycle for each page walk occurring due to a demand data store. Includes cycles spent traversing the Extended Page Table (EPT). Average cycles per walk can be calculated by dividing by the number of walks. EventSel=49H UMask=10H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    EPT.WALK_PENDING Counts once per cycle for each page walk only while traversing the Extended Page Table (EPT), and does not count during the rest of the translation. The EPT is used for translating Guest-Physical Addresses to Physical Addresses for Virtual Machine Monitors (VMMs). Average cycles per walk can be calculated by dividing the count by number of walks. EventSel=4FH UMask=10H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    FETCH_STALL.ALL Counts cycles that fetch is stalled due to any reason. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes. This will include cycles due to an ITLB miss, ICache miss and other events. EventSel=86H UMask=00H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    FETCH_STALL.ICACHE_FILL_PENDING_CYCLES Counts cycles that fetch is stalled due to an outstanding ICache miss. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes due to an ICache miss. Note: this event is not the same as the total number of cycles spent retrieving instruction cache lines from the memory hierarchy. EventSel=86H UMask=02H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    FETCH_STALL.ITLB_FILL_PENDING_CYCLES Counts cycles that fetch is stalled due to an outstanding ITLB miss. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes due to an ITLB miss. Note: this event is not the same as page walk cycles to retrieve an instruction translation. EventSel=86H UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    HW_INTERRUPTS.MASKED Counts the number of core cycles during which interrupts are masked (disabled). Increments by 1 each core cycle that EFLAGS.IF is 0, regardless of whether interrupts are pending or not. EventSel=CBH UMask=02H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    HW_INTERRUPTS.PENDING_AND_MASKED Counts core cycles during which there are pending interrupts, but interrupts are masked (EFLAGS.IF = 0). EventSel=CBH UMask=04H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    HW_INTERRUPTS.RECEIVED Counts hardware interrupts received by the processor. EventSel=CBH UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    ICACHE.ACCESSES Counts requests to the Instruction Cache (ICache) for one or more bytes in an ICache Line. The event strives to count on a cache line basis, so that multiple fetches to a single cache line count as one ICACHE.ACCESS. Specifically, the event counts when accesses from straight line code cross the cache line boundary, or when a branch target is to a new line. This event counts differently than Intel processors based on Silvermont microarchitecture. EventSel=80H UMask=03H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    ICACHE.HIT Counts requests to the Instruction Cache (ICache) for one or more bytes in an ICache Line and that cache line is in the ICache (hit). The event strives to count on a cache line basis, so that multiple accesses which hit in a single cache line count as one ICACHE.HIT. Specifically, the event counts when straight line code crosses the cache line boundary, or when a branch target is to a new line, and that cache line is in the ICache. This event counts differently than Intel processors based on Silvermont microarchitecture. EventSel=80H UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    ICACHE.MISSES Counts requests to the Instruction Cache (ICache) for one or more bytes in an ICache Line and that cache line is not in the ICache (miss). The event strives to count on a cache line basis, so that multiple accesses which miss in a single cache line count as one ICACHE.MISS. Specifically, the event counts when straight line code crosses the cache line boundary, or when a branch target is to a new line, and that cache line is not in the ICache. This event counts differently than Intel processors based on Silvermont microarchitecture. EventSel=80H UMask=02H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    INST_RETIRED.PREC_DIST Counts INST_RETIRED.ANY using the Reduced Skid PEBS feature that reduces the shadow in which events aren't counted allowing for a more unbiased distribution of samples across instructions retired. EventSel=C0H UMask=00H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0, PDISTCounter=0]
    CoreOnly
    ISSUE_SLOTS_NOT_CONSUMED.ANY Counts the number of issue slots per core cycle that were not consumed by the backend due to either a full resource in the backend (RESOURCE_FULL) or due to the processor recovering from some event (RECOVERY). EventSel=CAH UMask=00H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    ISSUE_SLOTS_NOT_CONSUMED.RECOVERY Counts the number of issue slots per core cycle that were not consumed by the backend because allocation is stalled waiting for a mispredicted jump to retire or other branch-like conditions (e.g. the event is relevant during certain microcode flows). Counts all issue slots blocked while within this window including slots where uops were not available in the Instruction Queue. EventSel=CAH UMask=02H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    ISSUE_SLOTS_NOT_CONSUMED.RESOURCE_FULL Counts the number of issue slots per core cycle that were not consumed because of a full resource in the backend. Including but not limited to resources such as the Re-order Buffer (ROB), reservation stations (RS), load/store buffers, physical registers, or any other needed machine resource that is currently unavailable. Note that uops must be available for consumption in order for this event to fire. If a uop is not available (Instruction Queue is empty), this event will not count. EventSel=CAH UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
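ISSUE_SLOTS_NOT_CONSUMED.ANY is described as covering slots lost to either RESOURCE_FULL or RECOVERY, so the two sub-events can be read as a breakdown of the ANY count. The sketch below expresses each sub-event as a share of ANY. It is an illustration under the assumption that the three counts were collected over the same interval; the function name and sample numbers are hypothetical, and the shares are not guaranteed to sum to exactly 1 on real hardware.

```python
def issue_slot_shares(any_slots: int, resource_full: int, recovery: int) -> dict:
    """Return each sub-event's share of ISSUE_SLOTS_NOT_CONSUMED.ANY."""
    if any_slots <= 0:
        return {"resource_full": 0.0, "recovery": 0.0}
    return {
        "resource_full": resource_full / any_slots,
        "recovery": recovery / any_slots,
    }


# Hypothetical counter readings taken over the same measurement window.
shares = issue_slot_shares(any_slots=1_000, resource_full=750, recovery=250)
print(shares)  # {'resource_full': 0.75, 'recovery': 0.25}
```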
    ITLB.MISS Counts the number of times the machine was unable to find a translation in the Instruction Translation Lookaside Buffer (ITLB) for a linear address of an instruction fetch. It counts when new translations are filled into the ITLB. The event is speculative in nature, but will not count translations (page walks) that are begun and not finished, or translations that are finished but not filled into the ITLB. EventSel=81H UMask=04H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    ITLB_MISSES.WALK_COMPLETED_1GB Counts page walks completed due to instruction fetches whose address translations missed in the TLB and were mapped to 1GB pages. The page walks can end with or without a page fault. EventSel=85H UMask=08H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    ITLB_MISSES.WALK_COMPLETED_2M_4M Counts page walks completed due to instruction fetches whose address translations missed in the TLB and were mapped to 2M or 4M pages. The page walks can end with or without a page fault. EventSel=85H UMask=04H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    ITLB_MISSES.WALK_COMPLETED_4K Counts page walks completed due to instruction fetches whose address translations missed in the TLB and were mapped to 4K pages. The page walks can end with or without a page fault. EventSel=85H UMask=02H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    ITLB_MISSES.WALK_PENDING Counts once per cycle for each page walk occurring due to an instruction fetch. Includes cycles spent traversing the Extended Page Table (EPT). Average cycles per walk can be calculated by dividing by the number of walks. EventSel=85H UMask=10H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    L2_REJECT_XQ.ALL Counts the number of demand and prefetch transactions that the L2 XQ rejects due to a full or near full condition which likely indicates back pressure from the intra-die interconnect (IDI) fabric. The XQ may reject transactions from the L2Q (non-cacheable requests), L2 misses and L2 write-back victims. EventSel=30H UMask=00H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    LD_BLOCKS.4K_ALIAS Counts loads that block because their address modulo 4K matches a pending store. EventSel=03H UMask=04H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    LD_BLOCKS.ALL_BLOCK Counts anytime a load that retires is blocked for any reason. EventSel=03H UMask=10H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    LD_BLOCKS.DATA_UNKNOWN Counts loads that were blocked from using a store forward because the store data was not available at the right time. The forward might occur subsequently when the data is available. EventSel=03H UMask=01H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    LD_BLOCKS.STORE_FORWARD Counts a load blocked from using a store forward because of an address/size mismatch. Only one of the loads blocked from each store will be counted. EventSel=03H UMask=02H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    LD_BLOCKS.UTLB_MISS Counts loads blocked because they are unable to find their physical address in the micro TLB (UTLB). EventSel=03H UMask=08H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MACHINE_CLEARS.ALL Counts machine clears for any reason. EventSel=C3H UMask=00H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    MACHINE_CLEARS.DISAMBIGUATION Counts machine clears due to memory disambiguation. Memory disambiguation happens when a load which has been issued conflicts with a previous unretired store in the pipeline whose address was not known at issue time, but is later resolved to be the same as the load address. EventSel=C3H UMask=08H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    MACHINE_CLEARS.FP_ASSIST Counts machine clears due to floating point (FP) operations needing assists. For instance, if the result was a floating point denormal, the hardware clears the pipeline and reissues uops to produce the correct IEEE compliant denormal result. EventSel=C3H UMask=04H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    MACHINE_CLEARS.MEMORY_ORDERING Counts machine clears due to memory ordering issues. This occurs when a snoop request happens and the machine is uncertain if memory ordering will be preserved - as another core is in the process of modifying the data. EventSel=C3H UMask=02H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    MACHINE_CLEARS.PAGE_FAULT Counts the number of times that the machine clears due to a page fault. Covers both I-side and D-side (loads/stores) page faults. A page fault occurs when either the page is not present or an access violation occurs. EventSel=C3H UMask=20H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    MACHINE_CLEARS.SMC Counts the number of times that the processor detects that a program is writing to a code section and has to perform a machine clear because of that modification. Self-modifying code (SMC) causes a severe penalty in all Intel® architecture processors. EventSel=C3H UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    MEM_LOAD_UOPS_RETIRED.DRAM_HIT Counts memory load uops retired where the data is retrieved from DRAM. Event is counted at retirement, so the speculative loads are ignored. A memory load can hit (or miss) the L1 cache, hit (or miss) the L2 cache, hit DRAM, hit in the WCB or receive a HITM response. EventSel=D1H UMask=80H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_LOAD_UOPS_RETIRED.HITM Counts load uops retired where the cache line containing the data was in the modified state of another core's or module's cache (HITM). More specifically, this means that when the load address was checked by other caching agents (typically another processor) in the system, one of those caching agents indicated that they had a dirty copy of the data. Loads that obtain a HITM response incur greater latency than is typical for a load. In addition, since HITM indicates that some other processor had this data in its cache, it implies that the data was shared between processors, or potentially was a lock or semaphore value. This event is useful for locating sharing, false sharing, and contended locks. EventSel=D1H UMask=20H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_LOAD_UOPS_RETIRED.L1_HIT Counts load uops retired that hit the L1 data cache. EventSel=D1H UMask=01H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_LOAD_UOPS_RETIRED.L1_MISS Counts load uops retired that miss the L1 data cache. EventSel=D1H UMask=08H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_LOAD_UOPS_RETIRED.L2_HIT Counts load uops retired that hit in the L2 cache. EventSel=D1H UMask=02H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_LOAD_UOPS_RETIRED.L2_MISS Counts load uops retired that miss in the L2 cache. EventSel=D1H UMask=10H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_LOAD_UOPS_RETIRED.WCB_HIT Counts memory load uops retired where the data is retrieved from the WCB (or fill buffer), indicating that the load found its data while that data was in the process of being brought into the L1 cache. Typically a load will receive this indication when some other load or prefetch missed the L1 cache and was in the process of retrieving the cache line containing the data, but that process had not yet finished (and written the data back to the cache). For example, consider loads X and Y, both referencing the same cache line that is not in the L1 cache. If load X misses the cache first, it obtains a WCB (or fill buffer) and begins the process of requesting the data. When load Y requests the data, it will either hit the WCB, or the L1 cache, depending on exactly what time the request from Y occurs. EventSel=D1H UMask=40H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_UOPS_RETIRED.ALL Counts the number of memory uops retired that are either a load, a store, or both. EventSel=D0H UMask=83H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_UOPS_RETIRED.ALL_LOADS Counts the number of load uops retired. EventSel=D0H UMask=81H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_UOPS_RETIRED.ALL_STORES Counts the number of store uops retired. EventSel=D0H UMask=82H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_UOPS_RETIRED.DTLB_MISS Counts uops retired that had a DTLB miss on load, store or either. Note that when two distinct memory operations to the same page miss the DTLB, only one of them will be recorded as a DTLB miss. EventSel=D0H UMask=13H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_UOPS_RETIRED.DTLB_MISS_LOADS Counts load uops retired that caused a DTLB miss. EventSel=D0H UMask=11H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_UOPS_RETIRED.DTLB_MISS_STORES Counts store uops retired that caused a DTLB miss. EventSel=D0H UMask=12H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_UOPS_RETIRED.LOCK_LOADS Counts locked memory uops retired. This includes "regular" locks and bus locks. (To specifically count bus locks only, see the Offcore response event.) A locked access is one with a lock prefix, or an exchange to memory. See the SDM for a complete description of which memory load accesses are locks. EventSel=D0H UMask=21H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_UOPS_RETIRED.SPLIT Counts memory uops retired where the data requested spans a 64 byte cache line boundary. EventSel=D0H UMask=43H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_UOPS_RETIRED.SPLIT_LOADS Counts load uops retired where the data requested spans a 64 byte cache line boundary. EventSel=D0H UMask=41H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MEM_UOPS_RETIRED.SPLIT_STORES Counts store uops retired where the data requested spans a 64 byte cache line boundary. EventSel=D0H UMask=42H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, DataLinearAddress, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
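In the MEM_UOPS_RETIRED entries above, each combined event (ALL, DTLB_MISS, SPLIT) lists a UMask that equals the bitwise OR of its load-only and store-only variants. The sketch below checks that pattern against the values copied from the table. It is an observation about the listed numbers, not a documented statement of how the hardware interprets individual UMask bits, and the helper name is hypothetical.

```python
# UMask values copied from the MEM_UOPS_RETIRED rows (EventSel=D0H).
MEM_UOPS_RETIRED_UMASKS = {
    "ALL": 0x83, "ALL_LOADS": 0x81, "ALL_STORES": 0x82,
    "DTLB_MISS": 0x13, "DTLB_MISS_LOADS": 0x11, "DTLB_MISS_STORES": 0x12,
    "SPLIT": 0x43, "SPLIT_LOADS": 0x41, "SPLIT_STORES": 0x42,
}


def is_union_of_load_and_store(combined: str, loads: str, stores: str) -> bool:
    """True if the combined UMask is the OR of the load and store UMasks."""
    u = MEM_UOPS_RETIRED_UMASKS
    return u[combined] == (u[loads] | u[stores])


for combined, loads, stores in [
    ("ALL", "ALL_LOADS", "ALL_STORES"),
    ("DTLB_MISS", "DTLB_MISS_LOADS", "DTLB_MISS_STORES"),
    ("SPLIT", "SPLIT_LOADS", "SPLIT_STORES"),
]:
    print(combined, is_union_of_load_and_store(combined, loads, stores))
```

LOCK_LOADS (UMask=21H) is left out because the table lists only a load variant for it.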
    MISALIGN_MEM_REF.LOAD_PAGE_SPLIT Counts when a memory load of a uop that spans a page boundary (a split) is retired. EventSel=13H UMask=02H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MISALIGN_MEM_REF.STORE_PAGE_SPLIT Counts when a memory store of a uop that spans a page boundary (a split) is retired. EventSel=13H UMask=04H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    MS_DECODED.MS_ENTRY Counts the number of times the Microcode Sequencer (MS) starts a flow of uops from the MSROM. It does not count every time a uop is read from the MSROM. The most common case that this counts is when a micro-coded instruction is encountered by the front end of the machine. Other cases include when an instruction encounters a fault, trap, or microcode assist of any sort that initiates a flow of uops. The event will count MS startups for uops that are speculative, and subsequently cleared by branch mispredict or a machine clear. EventSel=E7H UMask=01H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    TLB_FLUSHES.STLB_ANY Counts STLB flushes. The TLBs are flushed on instructions like INVLPG and MOV to CR3. EventSel=BDH UMask=20H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    UOPS_ISSUED.ANY Counts uops issued by the front end and allocated into the back end of the machine. This event counts uops that retire as well as uops that were speculatively executed but didn't retire. The sort of speculative uops that might be counted includes, but is not limited to, uops issued in the shadow of a mispredicted branch, uops that are inserted during an assist (such as for a denormal floating point result), and (previously allocated) uops that might be canceled during a machine clear. EventSel=0EH UMask=00H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    UOPS_NOT_DELIVERED.ANY This event is used to measure front-end inefficiencies, i.e. cycles when the front-end of the machine is not delivering uops to the back-end and the back-end is not stalled. This event can be used to identify whether the machine is truly front-end bound. When this event occurs, it is an indication that the front-end of the machine is operating at less than its theoretical peak performance. Background: The processor pipeline can be thought of as being divided into 2 broad parts: front-end and back-end. The front-end is responsible for fetching instructions, decoding them into uops in a machine-understandable format, and putting them into a uop queue to be consumed by the back-end. The back-end then takes these uops and allocates the required resources. When all resources are ready, uops are executed. If the back-end is not ready to accept uops from the front-end, these cycles should not be counted as front-end bottlenecks. However, whenever there are bottlenecks in the back-end, the allocation unit stalls, eventually forcing the front-end to wait until the back-end is ready to receive more uops. This event counts only when the back-end is requesting more uops and the front-end is not able to provide them. When 3 uops are requested and no uops are delivered, the event counts 3. When 3 are requested and only 1 is delivered, the event counts 2. When only 2 are delivered, the event counts 1. Alternatively stated, the event will not count if 3 uops are delivered, or if the back-end is stalled and not requesting any uops at all. Counts indicate missed opportunities for the front-end to deliver a uop to the back-end. Some examples of conditions that cause front-end inefficiencies are ICache misses, ITLB misses, and decoder restrictions that limit the front-end bandwidth. Known Issues: Some uops require multiple allocation slots. These uops will not be charged as a front-end 'not delivered' opportunity and will be regarded as a back-end problem. For example, the INC instruction has one uop that requires 2 issue slots. A stream of INC instructions will not count as UOPS_NOT_DELIVERED, even though only one instruction can be issued per clock. The low uop issue rate for a stream of INC instructions is considered to be a back-end issue. EventSel=9CH UMask=00H
    Counter=0,1,2,3
    PEBS:[NonPreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
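The counting rule in the UOPS_NOT_DELIVERED.ANY description can be illustrated with a small model. This is only a sketch of the arithmetic stated in the text (3 requested uops per cycle, no count while the back end is stalled); the names `ALLOC_WIDTH` and `not_delivered_slots` are hypothetical and do not correspond to any Intel or tool-defined identifier.

```python
# Illustrative model of the per-cycle increment described for
# UOPS_NOT_DELIVERED.ANY. Names here are hypothetical.
ALLOC_WIDTH = 3  # uops requested per cycle, as stated in the description


def not_delivered_slots(delivered: int, backend_stalled: bool) -> int:
    """Return the event increment for one cycle."""
    if backend_stalled:
        # Back end is not requesting uops, so nothing is charged to the front end.
        return 0
    if not 0 <= delivered <= ALLOC_WIDTH:
        raise ValueError("delivered must be between 0 and ALLOC_WIDTH")
    return ALLOC_WIDTH - delivered


# (uops delivered, back end stalled?) for five example cycles
cycles = [(0, False), (1, False), (2, False), (3, False), (0, True)]
total = sum(not_delivered_slots(d, s) for d, s in cycles)
print(total)  # 3 + 2 + 1 + 0 + 0 = 6
```

The model does not cover the known issue with uops that occupy multiple allocation slots, which the description says are treated as a back-end limitation.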
    UOPS_RETIRED.ANY Counts uops which retired. EventSel=C2H UMask=00H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
    UOPS_RETIRED.FPDIV Counts the number of floating point divide uops retired. EventSel=C2H UMask=08H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    UOPS_RETIRED.IDIV Counts the number of integer divide uops retired. EventSel=C2H UMask=10H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3, PDISTCounter=0]
    CoreOnly
    UOPS_RETIRED.MS Counts uops retired that are from the complex flows issued by the micro-sequencer (MS). Counts both the uops from a micro-coded instruction, and the uops that might be generated from a micro-coded assist. EventSel=C2H UMask=01H
    Counter=0,1,2,3
    PEBS:[PreciseEventingIP, Counter=0,1,2,3]
    CoreOnly
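The EventSel and UMask columns of the core events above are typically combined into a single raw event code, with EventSel in bits 7:0 and UMask in bits 15:8 of the event select register. The following sketch shows that packing for a few rows of the table. The helper names are hypothetical, and the `r<hex>` form printed is the raw-event notation accepted by Linux perf, given here as an assumption about the tool the reader uses.

```python
# Pack EventSel/UMask values from the table into a raw event code.
# Helper names are illustrative, not part of any tool's API.

def parse_hex(field: str) -> int:
    """Convert the table's 'D0H' style notation to an integer."""
    return int(field.rstrip("Hh"), 16)


def raw_config(event_sel: str, umask: str) -> int:
    """EventSel occupies bits 7:0, UMask bits 15:8."""
    return (parse_hex(umask) << 8) | parse_hex(event_sel)


events = {
    "MEM_UOPS_RETIRED.LOCK_LOADS": ("D0H", "21H"),
    "UOPS_RETIRED.IDIV": ("C2H", "10H"),
    "UOPS_NOT_DELIVERED.ANY": ("9CH", "00H"),
}

for name, (sel, umask) in events.items():
    print(f"{name}: r{raw_config(sel, umask):04x}")
# MEM_UOPS_RETIRED.LOCK_LOADS: r21d0
# UOPS_RETIRED.IDIV: r10c2
# UOPS_NOT_DELIVERED.ANY: r009c
```

With Linux perf, for example, the first code could be passed as `perf stat -e r21d0`. Other fields of the event select register (edge detect, counter mask, and so on) are left at zero in this sketch.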
    UNCORE
    OFFCORE
    OFFCORE_RESPONSE:request=DEMAND_DATA_RD: response=ANY_RESPONSE Counts demand cacheable data reads of full cache lines that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=10001H Offcore
    OFFCORE_RESPONSE:request=DEMAND_DATA_RD: response=L2_HIT Counts demand cacheable data reads of full cache lines that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=40001H Offcore
    OFFCORE_RESPONSE:request=DEMAND_DATA_RD: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts demand cacheable data reads of full cache lines that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200000001H Offcore
    OFFCORE_RESPONSE:request=DEMAND_DATA_RD: response=L2_MISS.HITM_OTHER_CORE Counts demand cacheable data reads of full cache lines that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000000001H Offcore
    OFFCORE_RESPONSE:request=DEMAND_DATA_RD: response=OUTSTANDING Counts demand cacheable data reads of full cache lines that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000000001H Offcore
    OFFCORE_RESPONSE:request=DEMAND_RFO: response=ANY_RESPONSE Counts demand reads for ownership (RFO) requests generated by a write to a full data cache line that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=10002H Offcore
    OFFCORE_RESPONSE:request=DEMAND_RFO: response=L2_HIT Counts demand reads for ownership (RFO) requests generated by a write to a full data cache line that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=40002H Offcore
    OFFCORE_RESPONSE:request=DEMAND_RFO: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts demand reads for ownership (RFO) requests generated by a write to a full data cache line that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200000002H Offcore
    OFFCORE_RESPONSE:request=DEMAND_RFO: response=L2_MISS.HITM_OTHER_CORE Counts demand reads for ownership (RFO) requests generated by a write to a full data cache line that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000000002H Offcore
    OFFCORE_RESPONSE:request=DEMAND_RFO: response=OUTSTANDING Counts demand reads for ownership (RFO) requests generated by a write to a full data cache line that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000000002H Offcore
    OFFCORE_RESPONSE:request=DEMAND_CODE_RD: response=ANY_RESPONSE Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache and have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=10004H Offcore
    OFFCORE_RESPONSE:request=DEMAND_CODE_RD: response=L2_HIT Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache and hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=40004H Offcore
    OFFCORE_RESPONSE:request=DEMAND_CODE_RD: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache and are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200000004H Offcore
    OFFCORE_RESPONSE:request=DEMAND_CODE_RD: response=L2_MISS.HITM_OTHER_CORE Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache and miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000000004H Offcore
    OFFCORE_RESPONSE:request=DEMAND_CODE_RD: response=OUTSTANDING Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache and are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000000004H Offcore
    OFFCORE_RESPONSE:request=COREWB: response=ANY_RESPONSE Counts the number of writeback transactions caused by L1 or L2 cache evictions that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=10008H Offcore
    OFFCORE_RESPONSE:request=COREWB: response=L2_HIT Counts the number of writeback transactions caused by L1 or L2 cache evictions that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=40008H Offcore
    OFFCORE_RESPONSE:request=COREWB: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts the number of writeback transactions caused by L1 or L2 cache evictions that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200000008H Offcore
    OFFCORE_RESPONSE:request=COREWB: response=L2_MISS.HITM_OTHER_CORE Counts the number of writeback transactions caused by L1 or L2 cache evictions that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000000008H Offcore
    OFFCORE_RESPONSE:request=COREWB: response=OUTSTANDING Counts the number of writeback transactions caused by L1 or L2 cache evictions that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000000008H Offcore
    OFFCORE_RESPONSE:request=PF_L2_DATA_RD: response=ANY_RESPONSE Counts data cacheline reads generated by the hardware L2 cache prefetcher that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=10010H Offcore
    OFFCORE_RESPONSE:request=PF_L2_DATA_RD: response=L2_HIT Counts data cacheline reads generated by the hardware L2 cache prefetcher that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=40010H Offcore
    OFFCORE_RESPONSE:request=PF_L2_DATA_RD: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts data cacheline reads generated by the hardware L2 cache prefetcher that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200000010H Offcore
    OFFCORE_RESPONSE:request=PF_L2_DATA_RD: response=L2_MISS.HITM_OTHER_CORE Counts data cacheline reads generated by the hardware L2 cache prefetcher that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000000010H Offcore
    OFFCORE_RESPONSE:request=PF_L2_DATA_RD: response=OUTSTANDING Counts data cacheline reads generated by the hardware L2 cache prefetcher that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000000010H Offcore
    OFFCORE_RESPONSE:request=PF_L2_RFO: response=ANY_RESPONSE Counts reads for ownership (RFO) requests generated by the L2 cache prefetcher that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=10020H Offcore
    OFFCORE_RESPONSE:request=PF_L2_RFO: response=L2_HIT Counts reads for ownership (RFO) requests generated by the L2 cache prefetcher that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=40020H Offcore
    OFFCORE_RESPONSE:request=PF_L2_RFO: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts reads for ownership (RFO) requests generated by the L2 cache prefetcher that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200000020H Offcore
    OFFCORE_RESPONSE:request=PF_L2_RFO: response=L2_MISS.HITM_OTHER_CORE Counts reads for ownership (RFO) requests generated by the L2 cache prefetcher that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000000020H Offcore
    OFFCORE_RESPONSE:request=PF_L2_RFO: response=OUTSTANDING Counts reads for ownership (RFO) requests generated by the L2 cache prefetcher that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000000020H Offcore
    OFFCORE_RESPONSE:request=BUS_LOCKS: response=ANY_RESPONSE Counts bus lock and split lock requests that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=10400H Offcore
    OFFCORE_RESPONSE:request=BUS_LOCKS: response=L2_HIT Counts bus lock and split lock requests that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=40400H Offcore
    OFFCORE_RESPONSE:request=BUS_LOCKS: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts bus lock and split lock requests that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200000400H Offcore
    OFFCORE_RESPONSE:request=BUS_LOCKS: response=L2_MISS.HITM_OTHER_CORE Counts bus lock and split lock requests that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000000400H Offcore
    OFFCORE_RESPONSE:request=BUS_LOCKS: response=OUTSTANDING Counts bus lock and split lock requests that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000000400H Offcore
    OFFCORE_RESPONSE:request=FULL_STREAMING_STORES: response=ANY_RESPONSE Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=10800H Offcore
    OFFCORE_RESPONSE:request=FULL_STREAMING_STORES: response=L2_HIT Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=40800H Offcore
    OFFCORE_RESPONSE:request=FULL_STREAMING_STORES: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200000800H Offcore
    OFFCORE_RESPONSE:request=FULL_STREAMING_STORES: response=L2_MISS.HITM_OTHER_CORE Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000000800H Offcore
    OFFCORE_RESPONSE:request=FULL_STREAMING_STORES: response=OUTSTANDING Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000000800H Offcore
    OFFCORE_RESPONSE:request=SW_PREFETCH: response=ANY_RESPONSE Counts data cache line requests by software prefetch instructions that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=11000H Offcore
    OFFCORE_RESPONSE:request=SW_PREFETCH: response=L2_HIT Counts data cache line requests by software prefetch instructions that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=41000H Offcore
    OFFCORE_RESPONSE:request=SW_PREFETCH: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts data cache line requests by software prefetch instructions that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200001000H Offcore
    OFFCORE_RESPONSE:request=SW_PREFETCH: response=L2_MISS.HITM_OTHER_CORE Counts data cache line requests by software prefetch instructions that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000001000H Offcore
    OFFCORE_RESPONSE:request=SW_PREFETCH: response=OUTSTANDING Counts data cache line requests by software prefetch instructions that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000001000H Offcore
    OFFCORE_RESPONSE:request=PF_L1_DATA_RD: response=ANY_RESPONSE Counts data cache line reads generated by the hardware L1 data cache prefetcher that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=12000H Offcore
    OFFCORE_RESPONSE:request=PF_L1_DATA_RD: response=L2_HIT Counts data cache line reads generated by the hardware L1 data cache prefetcher that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=42000H Offcore
    OFFCORE_RESPONSE:request=PF_L1_DATA_RD: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts data cache line reads generated by the hardware L1 data cache prefetcher that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200002000H Offcore
    OFFCORE_RESPONSE:request=PF_L1_DATA_RD: response=L2_MISS.HITM_OTHER_CORE Counts data cache line reads generated by the hardware L1 data cache prefetcher that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000002000H Offcore
    OFFCORE_RESPONSE:request=PF_L1_DATA_RD: response=OUTSTANDING Counts data cache line reads generated by the hardware L1 data cache prefetcher that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000002000H Offcore
    OFFCORE_RESPONSE:request=STREAMING_STORES: response=ANY_RESPONSE Counts any data writes to uncacheable write combining (USWC) memory region that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=14800H Offcore
    OFFCORE_RESPONSE:request=STREAMING_STORES: response=L2_HIT Counts any data writes to uncacheable write combining (USWC) memory region that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=44800H Offcore
    OFFCORE_RESPONSE:request=STREAMING_STORES: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts any data writes to uncacheable write combining (USWC) memory region that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200004800H Offcore
    OFFCORE_RESPONSE:request=STREAMING_STORES: response=L2_MISS.HITM_OTHER_CORE Counts any data writes to uncacheable write combining (USWC) memory region that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000004800H Offcore
    OFFCORE_RESPONSE:request=STREAMING_STORES: response=OUTSTANDING Counts any data writes to uncacheable write combining (USWC) memory region that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000004800H Offcore
    OFFCORE_RESPONSE:request=ANY_REQUEST: response=ANY_RESPONSE Counts requests to the uncore subsystem that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=18000H Offcore
    OFFCORE_RESPONSE:request=ANY_REQUEST: response=L2_HIT Counts requests to the uncore subsystem that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=48000H Offcore
    OFFCORE_RESPONSE:request=ANY_REQUEST: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts requests to the uncore subsystem that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200008000H Offcore
    OFFCORE_RESPONSE:request=ANY_REQUEST: response=L2_MISS.HITM_OTHER_CORE Counts requests to the uncore subsystem that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000008000H Offcore
    OFFCORE_RESPONSE:request=ANY_REQUEST: response=OUTSTANDING Counts requests to the uncore subsystem that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000008000H Offcore
    OFFCORE_RESPONSE:request=ANY_PF_DATA_RD: response=ANY_RESPONSE Counts data reads generated by L1 or L2 prefetchers that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=13010H Offcore
    OFFCORE_RESPONSE:request=ANY_PF_DATA_RD: response=L2_HIT Counts data reads generated by L1 or L2 prefetchers that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=43010H Offcore
    OFFCORE_RESPONSE:request=ANY_PF_DATA_RD: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts data reads generated by L1 or L2 prefetchers that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200003010H Offcore
    OFFCORE_RESPONSE:request=ANY_PF_DATA_RD: response=L2_MISS.HITM_OTHER_CORE Counts data reads generated by L1 or L2 prefetchers that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000003010H Offcore
    OFFCORE_RESPONSE:request=ANY_PF_DATA_RD: response=OUTSTANDING Counts data reads generated by L1 or L2 prefetchers that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000003010H Offcore
    OFFCORE_RESPONSE:request=ANY_DATA_RD: response=ANY_RESPONSE Counts data reads (demand & prefetch) that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=13091H Offcore
    OFFCORE_RESPONSE:request=ANY_DATA_RD: response=L2_HIT Counts data reads (demand & prefetch) that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=43091H Offcore
    OFFCORE_RESPONSE:request=ANY_DATA_RD: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts data reads (demand & prefetch) that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200003091H Offcore
    OFFCORE_RESPONSE:request=ANY_DATA_RD: response=L2_MISS.HITM_OTHER_CORE Counts data reads (demand & prefetch) that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000003091H Offcore
    OFFCORE_RESPONSE:request=ANY_DATA_RD: response=OUTSTANDING Counts data reads (demand & prefetch) that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000003091H Offcore
    OFFCORE_RESPONSE:request=ANY_RFO: response=ANY_RESPONSE Counts reads for ownership (RFO) requests (demand & prefetch) that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=10022H Offcore
    OFFCORE_RESPONSE:request=ANY_RFO: response=L2_HIT Counts reads for ownership (RFO) requests (demand & prefetch) that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=40022H Offcore
    OFFCORE_RESPONSE:request=ANY_RFO: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts reads for ownership (RFO) requests (demand & prefetch) that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=200000022H Offcore
    OFFCORE_RESPONSE:request=ANY_RFO: response=L2_MISS.HITM_OTHER_CORE Counts reads for ownership (RFO) requests (demand & prefetch) that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=1000000022H Offcore
    OFFCORE_RESPONSE:request=ANY_RFO: response=OUTSTANDING Counts reads for ownership (RFO) requests (demand & prefetch) that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=4000000022H Offcore
    OFFCORE_RESPONSE:request=ANY_READ: response=ANY_RESPONSE Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that have any transaction response from the uncore subsystem. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=132B7H Offcore
    OFFCORE_RESPONSE:request=ANY_READ: response=L2_HIT Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that hit the L2 cache. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=432B7H Offcore
    OFFCORE_RESPONSE:request=ANY_READ: response=L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that are a true miss for the L2 cache with a snoop miss in the other processor module. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=2000032B7H Offcore
    OFFCORE_RESPONSE:request=ANY_READ: response=L2_MISS.HITM_OTHER_CORE Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that miss the L2 cache with a snoop hit in the other processor module; data forwarding is required. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx{1A6H,1A7H}=10000032B7H Offcore
    OFFCORE_RESPONSE:request=ANY_READ: response=OUTSTANDING Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that are outstanding, per cycle, from the time of the L2 miss to when any response is received. EventSel=(B7H) UMask={01H,02H} MSR_OFFCORE_RSPx(1A6H)=40000032B7H Offcore
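The MSR_OFFCORE_RSPx values in the rows above follow a regular pattern: each value is the bitwise OR of a request mask and a response mask. The sketch below reproduces that composition using only masks read off this table. The dictionary and function names are hypothetical, and the bit meanings are inferred from the listed values rather than taken from a register definition.

```python
# Compose MSR_OFFCORE_RSPx values as (request mask | response mask).
# Masks are read off the table above; names are illustrative only.
REQUEST = {
    "DEMAND_DATA_RD": 0x0001,
    "DEMAND_RFO": 0x0002,
    "DEMAND_CODE_RD": 0x0004,
    "COREWB": 0x0008,
    "PF_L2_DATA_RD": 0x0010,
    "PF_L2_RFO": 0x0020,
    "BUS_LOCKS": 0x0400,
    "FULL_STREAMING_STORES": 0x0800,
    "SW_PREFETCH": 0x1000,
    "PF_L1_DATA_RD": 0x2000,
    "STREAMING_STORES": 0x4800,
    "ANY_REQUEST": 0x8000,
    "ANY_PF_DATA_RD": 0x3010,
    "ANY_DATA_RD": 0x3091,
    "ANY_RFO": 0x0022,
    "ANY_READ": 0x32B7,
}

RESPONSE = {
    "ANY_RESPONSE": 0x10000,
    "L2_HIT": 0x40000,
    "L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED": 0x200000000,
    "L2_MISS.HITM_OTHER_CORE": 0x1000000000,
    "OUTSTANDING": 0x4000000000,
}


def offcore_rsp_value(request: str, response: str) -> int:
    """Return the MSR value for one request/response combination."""
    return REQUEST[request] | RESPONSE[response]


def table_hex(value: int) -> str:
    """Format a value in the table's 'NNNNH' notation."""
    return f"{value:X}H"


print(table_hex(offcore_rsp_value("DEMAND_DATA_RD", "ANY_RESPONSE")))  # 10001H
print(table_hex(offcore_rsp_value("STREAMING_STORES", "L2_HIT")))      # 44800H
print(table_hex(offcore_rsp_value("ANY_READ", "OUTSTANDING")))         # 40000032B7H
```

Note that the OUTSTANDING rows list only MSR 1A6H, whereas the other responses list both 1A6H and 1A7H; the sketch computes the value only and does not model which MSR may hold it.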