Name

    INTEL_performance_query

Name Strings

    GL_INTEL_performance_query

Contact

    Tomasz Madajczak, Intel (tomasz.madajczak 'at' intel.com)

Contributors

    Piotr Uminski, Intel
    Slawomir Grajewski, Intel

Status

    Complete, shipping on selected Intel graphics.

Version

    Last Modified Date: December 20, 2013
    Revision: 3

Number

    OpenGL Extension #443
    OpenGL ES Extension #164

Dependencies

    OpenGL dependencies:

    OpenGL 3.0 is required.

    The extension is written against the OpenGL 4.4 Specification, Core
    Profile, October 18, 2013.

    OpenGL ES dependencies:

    This extension is written against the OpenGL ES 2.0.25 Specification
    and OpenGL ES 3.0.2 Specification.

Overview

    The purpose of this extension is to expose Intel proprietary hardware
    performance counters to OpenGL applications. Performance counters may
    count:

    - number of hardware events such as number of spawned vertex shaders. In
      this case the results represent the number of events.

    - duration of certain activity, like time taken by all fragment shader
      invocations. In that case the result usually represents the number of
      clocks in which the particular HW unit was busy. In order to use such a
      counter efficiently, it should be normalized to the range of <0,1> by
      dividing its value by the number of render clocks.

    - used throughput of certain memory types such as texture memory. In that
      case the result of the performance counter usually represents the
      number of bytes transferred between GPU and memory.

    This extension specifies a universal API to manage performance counters
    on different Intel hardware platforms. Performance counters are grouped
    together into proprietary, hardware-specific, fixed sets of counters that
    are measured together by the GPU.

    It is assumed that performance counters can be started and stopped at
    arbitrary boundaries during rendering.

    A set of performance counters is represented by a unique query type.
    Each query type is identified by an assigned name and ID. Multiple query
    types (sets of performance counters) are supported by the Intel hardware.
    However, each Intel hardware generation supports different sets of
    performance counters, so the query types can differ between hardware
    generations. The definitions of query types and their result structures
    can be learned through the API. They are also documented in a separate
    Intel OGL Performance Counters Specification issued for each new hardware
    generation.

    The API allows creating multiple instances of any query type and sampling
    different fragments of 3D rendering with such instances. Query instances
    are identified with handles.
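
    The normalization of duration counters described above can be sketched
    as a small helper. This is only an illustration: the function name and
    the clamping to 1.0 are assumptions, not part of the extension, and the
    two inputs are taken to be a busy-clock counter and the render-clock
    count read from the same query result.

```c
#include <stdint.h>

/* Hypothetical helper (not defined by the extension): converts a raw
 * busy-clock count into a utilization ratio in the range [0, 1]. */
double normalize_busy_clocks(uint64_t busy_clocks, uint64_t render_clocks)
{
    if (render_clocks == 0) {
        return 0.0; /* nothing was measured; avoid dividing by zero */
    }
    double ratio = (double)busy_clocks / (double)render_clocks;
    /* Assumption: clamp in case the two counters disagree slightly. */
    return ratio > 1.0 ? 1.0 : ratio;
}
```

    For example, a unit that was busy for 50 of 100 render clocks yields a
    utilization of 0.5.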

New Procedures and Functions

    void GetFirstPerfQueryIdINTEL(uint *queryId);

    void GetNextPerfQueryIdINTEL(uint queryId, uint *nextQueryId);

    void GetPerfQueryIdByNameINTEL(char *queryName, uint *queryId);

    void GetPerfQueryInfoINTEL(uint queryId,
        uint queryNameLength, char *queryName,
        uint *dataSize, uint *noCounters,
        uint *noInstances, uint *capsMask);

    void GetPerfCounterInfoINTEL(uint queryId, uint counterId,
        uint counterNameLength, char *counterName,
        uint counterDescLength, char *counterDesc,
        uint *counterOffset, uint *counterDataSize, uint *counterTypeEnum,
        uint *counterDataTypeEnum, uint64 *rawCounterMaxValue);

    void CreatePerfQueryINTEL(uint queryId, uint *queryHandle);

    void DeletePerfQueryINTEL(uint queryHandle);

    void BeginPerfQueryINTEL(uint queryHandle);

    void EndPerfQueryINTEL(uint queryHandle);

    void GetPerfQueryDataINTEL(uint queryHandle, uint flags,
        sizei dataSize, void *data, uint *bytesWritten);

New Tokens

    Returned by the capsMask parameter of GetPerfQueryInfoINTEL

        PERFQUERY_SINGLE_CONTEXT_INTEL          0x0000
        PERFQUERY_GLOBAL_CONTEXT_INTEL          0x0001

    Accepted by the flags parameter of GetPerfQueryDataINTEL

        PERFQUERY_WAIT_INTEL                    0x83FB
        PERFQUERY_FLUSH_INTEL                   0x83FA
        PERFQUERY_DONOT_FLUSH_INTEL             0x83F9

    Returned by the GetPerfCounterInfoINTEL function as counter type
    enumeration in the location pointed to by counterTypeEnum

        PERFQUERY_COUNTER_EVENT_INTEL           0x94F0
        PERFQUERY_COUNTER_DURATION_NORM_INTEL   0x94F1
        PERFQUERY_COUNTER_DURATION_RAW_INTEL    0x94F2
        PERFQUERY_COUNTER_THROUGHPUT_INTEL      0x94F3
        PERFQUERY_COUNTER_RAW_INTEL             0x94F4
        PERFQUERY_COUNTER_TIMESTAMP_INTEL       0x94F5

    Returned by the GetPerfCounterInfoINTEL function as counter data type
    enumeration in the location pointed to by counterDataTypeEnum

        PERFQUERY_COUNTER_DATA_UINT32_INTEL     0x94F8
        PERFQUERY_COUNTER_DATA_UINT64_INTEL     0x94F9
        PERFQUERY_COUNTER_DATA_FLOAT_INTEL      0x94FA
        PERFQUERY_COUNTER_DATA_DOUBLE_INTEL     0x94FB
        PERFQUERY_COUNTER_DATA_BOOL32_INTEL     0x94FC

    Accepted by the <pname> parameter of GetIntegerv:

        PERFQUERY_QUERY_NAME_LENGTH_MAX_INTEL   0x94FD
        PERFQUERY_COUNTER_NAME_LENGTH_MAX_INTEL 0x94FE
        PERFQUERY_COUNTER_DESC_LENGTH_MAX_INTEL 0x94FF

    Accepted by
    the <pname> parameter of GetBooleanv:

        PERFQUERY_GPA_EXTENDED_COUNTERS_INTEL   0x9500

Add new Section 4.4 to Chapter 4, Event Model for OpenGL 4.4
Add new Section 2.18 to Chapter 2, OpenGL ES Operation for OpenGL ES 3.0.2

    4.4 Performance Queries (for OpenGL 4.4)
    2.18 Performance Queries (for OpenGL ES 3.0.2)

    Hardware and software performance counters can be used to obtain
    information about GPU activity. Performance counters are grouped into
    query types. Different query types can be supported on different hardware
    platforms and/or driver versions. One or more instances of the query
    types can be created.

    Each query type has a unique query ID. Query IDs supported on a given
    platform can be queried at run time. The function:

        void GetFirstPerfQueryIdINTEL(uint *queryId);

    returns the identifier of the first performance query type that is
    supported on a given platform. The result is passed in the location
    pointed to by the queryId parameter. If the given hardware platform
    doesn't support any performance queries, then the value of 0 is returned
    and an INVALID_OPERATION error is raised. If the queryId pointer is equal
    to 0, an INVALID_VALUE error is generated.

    Subsequent query IDs can be obtained by repeated calls to the function:

        void GetNextPerfQueryIdINTEL(uint queryId, uint *nextQueryId);

    This function returns the integer identifier of the performance query
    that follows, on a given platform, the one specified with queryId. The
    result is passed in the location pointed to by nextQueryId. If the query
    identified by queryId is the last query available, the value of 0 is
    returned. If the specified performance query identifier is invalid, an
    INVALID_VALUE error is generated. If the nextQueryId pointer is equal to
    0, an INVALID_VALUE error is generated. Whenever an error is generated,
    the value of 0 is returned.

    Each performance query type has a name and a unique identifier. The query
    identifier for a given query name can be read using the function:

        void GetPerfQueryIdByNameINTEL(char *queryName, uint *queryId);

    This function returns the identifier of the query type specified by the
    string provided as the queryName parameter. If queryName does not
    reference a valid query name, an INVALID_VALUE error is generated.
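
    The enumeration protocol above (start with the first ID, then ask for the
    next one until 0 comes back) can be sketched as follows. This is a
    minimal illustration, not a definitive implementation: the pointer
    typedefs, collect_query_ids and the fake_* functions are hypothetical
    names introduced here. A real program would pass the addresses of
    glGetFirstPerfQueryIdINTEL and glGetNextPerfQueryIdINTEL obtained from
    its extension loader; the fake driver only exists so that the loop can
    be exercised without a GL context.

```c
#include <stddef.h>

/* Hypothetical pointer types mirroring the two entry points described
 * above. */
typedef void (*get_first_id_fn)(unsigned int *queryId);
typedef void (*get_next_id_fn)(unsigned int queryId,
                               unsigned int *nextQueryId);

/* Stores up to `capacity` query IDs in `ids` and returns how many were
 * stored. An ID of 0 ends the walk, which also covers the error cases,
 * because the functions return 0 whenever an error is generated. */
size_t collect_query_ids(get_first_id_fn get_first, get_next_id_fn get_next,
                         unsigned int *ids, size_t capacity)
{
    size_t count = 0;
    unsigned int id = 0;

    get_first(&id);
    while (id != 0 && count < capacity) {
        unsigned int next_id = 0;
        ids[count++] = id;
        get_next(id, &next_id);
        id = next_id;
    }
    return count;
}

/* Stand-in driver exposing query IDs 1, 2 and 3 (illustration only). */
static void fake_get_first(unsigned int *queryId)
{
    *queryId = 1;
}

static void fake_get_next(unsigned int queryId, unsigned int *nextQueryId)
{
    *nextQueryId = (queryId >= 1 && queryId < 3) ? queryId + 1 : 0;
}
```

    With the stand-in driver, collect_query_ids(fake_get_first,
    fake_get_next, ids, 8) stores the IDs 1, 2 and 3 and returns 3.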

    A general description of a query type can be read using the function:

        void GetPerfQueryInfoINTEL(uint queryId, uint queryNameLength,
            char *queryName, uint *dataSize,
            uint *noCounters, uint *maxInstances,
            uint *noActiveInstances, uint *capsMask);

    The function returns information about the performance query specified
    with the queryId parameter, particularly:

    - query name in queryName location. The maximal length of the copied
      name is specified by queryNameLength

    - size of query output structure in bytes in dataSize location

    - number of performance counters in the query output structure in
      noCounters location

    - the maximal allowed number of query instances that can be created on a
      given architecture in maxInstances location. Because queries of other
      types are created from the same resources, the number of instances
      that can actually be created may be smaller than the returned number

    - the actual number of already created query instances in
      noActiveInstances location

    - mask of query capabilities in capsMask location.

    If the mask returned in capsMask contains the
    PERFQUERY_SINGLE_CONTEXT_INTEL token, the query supports context
    sensitive measurements. Otherwise, if the mask contains the
    PERFQUERY_GLOBAL_CONTEXT_INTEL token, the query doesn't support that
    feature and the counters will be updated for all render contexts, as
    they are global for the hardware.

    If queryId does not reference a valid query type, an INVALID_VALUE error
    is generated.

    Performance counters that belong to the same query type have unique
    IDs. Performance counter ID values start with 1. Performance counter ID 0
    is reserved as an invalid counter. Information about performance counters
    that belong to a given query type can be read using the function:

        void GetPerfCounterInfoINTEL(uint queryId, uint counterId,
            uint counterNameLength, char *counterName,
            uint counterDescLength, char *counterDesc,
            uint *counterOffset, uint *counterDataSize, uint *counterTypeEnum,
            uint *counterDataTypeEnum, uint64 *rawCounterMaxValue);

    The function returns descriptive information about each particular
    performance counter that is an element of the performance query. The
    counter is identified with a pair of queryId and counterId parameters.
    The following parameters are returned:

    - counter name in counterName location. The maximal length of the copied
      name is specified with counterNameLength.

    - counter description text in counterDesc location. The maximal length
      of the copied text is specified with counterDescLength.

    - byte offset of the counter from the start of the query structure in
      counterOffset location.

    - counter size in bytes in counterDataSize location.

    - counter type enumeration in counterTypeEnum location. It can be one of
      the following tokens:
        PERFQUERY_COUNTER_EVENT_INTEL
        PERFQUERY_COUNTER_DURATION_NORM_INTEL
        PERFQUERY_COUNTER_DURATION_RAW_INTEL
        PERFQUERY_COUNTER_THROUGHPUT_INTEL
        PERFQUERY_COUNTER_RAW_INTEL
        PERFQUERY_COUNTER_TIMESTAMP_INTEL

    - counter data type enumeration, in counterDataTypeEnum location.
      It can be one of the following tokens:
        PERFQUERY_COUNTER_DATA_UINT32_INTEL
        PERFQUERY_COUNTER_DATA_UINT64_INTEL
        PERFQUERY_COUNTER_DATA_FLOAT_INTEL
        PERFQUERY_COUNTER_DATA_DOUBLE_INTEL
        PERFQUERY_COUNTER_DATA_BOOL32_INTEL

    - for some raw counters for which the maximal value is deterministic, the
      maximal value of the counter in 1 second is returned in the location
      pointed to by rawCounterMaxValue. Otherwise, the value of 0 is written
      to that location.

    If the pair of queryId and counterId does not reference a valid counter,
    an INVALID_VALUE error is generated.

    A single instance of the performance query of a given type can be created
    using the function:

        void CreatePerfQueryINTEL(uint queryId, uint *queryHandle);

    The handle to the newly created query instance is returned in the
    queryHandle location. If queryId does not reference a valid query type,
    an INVALID_VALUE error is generated. If the query instance cannot be
    created because the number of allowed instances is exceeded, or because
    the driver fails query creation due to insufficient memory, an
    OUT_OF_MEMORY error is generated, and the location pointed to by
    queryHandle returns NULL.
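
    The counterOffset and counterDataTypeEnum values described above are
    what an application needs to pull one counter out of the result buffer
    later filled by GetPerfQueryDataINTEL. A minimal sketch, assuming the
    buffer has already been read: read_counter_as_double is a hypothetical
    helper, the token values are copied from the New Tokens section, and
    converting every type to double (which can lose precision for very large
    64-bit values) is a simplification chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Token values from the New Tokens section; guarded in case a GL header
 * already defines them. */
#ifndef GL_PERFQUERY_COUNTER_DATA_UINT32_INTEL
#define GL_PERFQUERY_COUNTER_DATA_UINT32_INTEL 0x94F8
#define GL_PERFQUERY_COUNTER_DATA_UINT64_INTEL 0x94F9
#define GL_PERFQUERY_COUNTER_DATA_FLOAT_INTEL  0x94FA
#define GL_PERFQUERY_COUNTER_DATA_DOUBLE_INTEL 0x94FB
#define GL_PERFQUERY_COUNTER_DATA_BOOL32_INTEL 0x94FC
#endif

/* Hypothetical helper: reads the counter stored at `offset` in `data` and
 * converts it to double. Returns 1 on success, 0 if the data type is
 * unknown or the counter would lie outside the buffer. */
int read_counter_as_double(const void *data, size_t data_size,
                           uint32_t offset, uint32_t data_type, double *out)
{
    const unsigned char *src = (const unsigned char *)data + offset;
    size_t size;

    switch (data_type) {
    case GL_PERFQUERY_COUNTER_DATA_UINT32_INTEL:
    case GL_PERFQUERY_COUNTER_DATA_BOOL32_INTEL:
    case GL_PERFQUERY_COUNTER_DATA_FLOAT_INTEL:
        size = 4;
        break;
    case GL_PERFQUERY_COUNTER_DATA_UINT64_INTEL:
    case GL_PERFQUERY_COUNTER_DATA_DOUBLE_INTEL:
        size = 8;
        break;
    default:
        return 0;
    }
    if (offset > data_size || size > data_size - offset) {
        return 0;
    }

    /* memcpy avoids unaligned reads from the byte buffer. */
    if (data_type == GL_PERFQUERY_COUNTER_DATA_FLOAT_INTEL) {
        float v;
        memcpy(&v, src, sizeof v);
        *out = v;
    } else if (data_type == GL_PERFQUERY_COUNTER_DATA_DOUBLE_INTEL) {
        memcpy(out, src, sizeof *out);
    } else if (data_type == GL_PERFQUERY_COUNTER_DATA_UINT64_INTEL) {
        uint64_t v;
        memcpy(&v, src, sizeof v);
        *out = (double)v;
    } else {
        uint32_t v;
        memcpy(&v, src, sizeof v);
        *out = (data_type == GL_PERFQUERY_COUNTER_DATA_BOOL32_INTEL)
                   ? (v != 0 ? 1.0 : 0.0)
                   : (double)v;
    }
    return 1;
}
```

    A caller would typically loop over counter IDs 1..noCounters, fetch each
    counter's offset and data type once with GetPerfCounterInfoINTEL, and
    reuse them for every result buffer of that query type.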

    An existing query instance can be deleted using the function:

        void DeletePerfQueryINTEL(uint queryHandle);

    queryHandle must be a query instance handle returned by
    CreatePerfQueryINTEL(). If a query handle doesn't reference a previously
    created performance query instance, an INVALID_VALUE error is generated.

    A new measurement session for a given query instance can be started using
    the function:

        void BeginPerfQueryINTEL(uint queryHandle);

    where queryHandle must be a query instance handle returned by
    CreatePerfQueryINTEL(). If a query handle doesn't reference a previously
    created performance query instance, an INVALID_VALUE error is
    generated. Note that some query types cannot be collected at the same
    time. Therefore calls to BeginPerfQueryINTEL() cannot be nested if they
    refer to queries of such different types. In that case an
    INVALID_OPERATION error is generated.

    The counters may not start immediately after BeginPerfQueryINTEL().
    Because the API and GPU are asynchronous, the start of performance
    counters is delayed until the graphics hardware actually executes the
    hardware commands issued by this function.
    However, it is guaranteed that the collection of performance counters
    will start before any draw calls specified in the same context after the
    call to BeginPerfQueryINTEL().

    Collecting performance counters may be stopped by the function:

        void EndPerfQueryINTEL(uint queryHandle);

    where queryHandle must be a query instance handle returned by
    CreatePerfQueryINTEL(). The function ends the measurement session started
    by BeginPerfQueryINTEL(). If a performance query is not currently
    started, an INVALID_OPERATION error will be generated. As in the
    BeginPerfQueryINTEL() case, the execution of EndPerfQueryINTEL() is not
    immediate. The end of measurement is delayed until the graphics hardware
    completes processing of the hardware commands issued by this
    function. However, it is guaranteed that the results of any draw calls
    specified in the same context after the call to EndPerfQueryINTEL() will
    not be measured by this query.

    The query result can be read using the function:

        void GetPerfQueryDataINTEL(uint queryHandle, uint flags,
            sizei dataSize, void *data, uint *bytesWritten);

    The function returns the values of counters which have been measured
    within the query session identified by queryHandle.
    The call may end without returning any data if the data is not ready for
    reading because the measurement session is still pending (the
    EndPerfQueryINTEL() command processing is not finished by hardware). In
    this case the location pointed to by the bytesWritten parameter will be
    set to 0. The meaning of the flags parameter is the following:

    - PERFQUERY_DONOT_FLUSH_INTEL means that the call of
      GetPerfQueryDataINTEL() is non-blocking: it checks for results and
      returns them if they are available. Otherwise (if the results of the
      query are not ready), it returns without flushing any outstanding 3D
      commands to the GPU. The use case for this is when a flush of
      outstanding 3D commands to the GPU has already been ensured with other
      OpenGL API calls.

    - PERFQUERY_FLUSH_INTEL means that the call of GetPerfQueryDataINTEL() is
      non-blocking: it checks for results and returns them if they are
      available. Otherwise, it implicitly submits any outstanding 3D commands
      to the GPU for execution. In that case a subsequent call of
      GetPerfQueryDataINTEL() may return data once the query completes.

    - PERFQUERY_WAIT_INTEL means that the call of GetPerfQueryDataINTEL() is
      blocking: it waits until the query results are available and returns
      them.
      It means that if the query results are not yet available then it
      implicitly submits any outstanding 3D commands to the GPU and waits for
      the query completion.

    If the measurement session identified by queryHandle is completed then
    the call of GetPerfQueryDataINTEL() always writes the query result to the
    location pointed to by the data parameter, and the number of bytes
    written is stored in the location pointed to by the bytesWritten
    parameter.

    If the bytesWritten or data pointers are NULL then an INVALID_VALUE error
    is generated.


New Implementation Dependent State

Add new Table 23.75 to Chapter 23, State Tables (OpenGL 4.4)
Add new Table 6.37 to Chapter 6.2, State Tables (OpenGL ES 3.0.2)


    Get Value                                Type  Get Command  Value  Description
    ---------------------------------------  ----  -----------  -----  ---------------------------
    PERFQUERY_QUERY_NAME_LENGTH_MAX_INTEL    Z+    GetIntegerv  256    max query name length
    PERFQUERY_COUNTER_NAME_LENGTH_MAX_INTEL  Z+    GetIntegerv  256    max counter name length
    PERFQUERY_COUNTER_DESC_LENGTH_MAX_INTEL  Z+    GetIntegerv  1024   max description length
    PERFQUERY_GPA_EXTENDED_COUNTERS_INTEL    B     GetBooleanv  -      extended counters available


Issues

    1. What is the usage model of this extension?

    Generally there are two approaches to measuring performance with Intel
    OGL Performance Queries:

    - Per draw call measurements - performance counters can be used to assess
      how busy particular 3D hardware units are, under the assumption that
      the 3D hardware is busy almost 100% of the time from the CPU point of
      view.

    - Per 3D scene measurements - performance counters can be used to assess
      the balance of CPU and GPU processing times. Such an assessment shows
      whether the workload is CPU bound or GPU bound.

    2. How are per draw call measurements performed?

    In the per draw call usage model each call to the draw routine
    (e.g. glDrawArrays, glDrawElements) should be surrounded by a dedicated
    query instance. That means that each draw operation should be measured
    independently. It is recommended to measure the GPU performance
    characteristics for a single draw call to find possible bottlenecks
    for the application executed on a given hardware.

    3. How are per scene measurements performed?

    The usage model assumes that one performance query instance measures a
    complete scene. This model is recommended for determining whether the
    workload is CPU or GPU bound.
    It should be noted that:

    - For a longer scope of performance query the probability of a 3D
      hardware frequency change is higher. As a result, a larger percentage
      of results may be biased with gross errors.

    - For complicated 3D scenes the condition of render commands split is
      always met.

    Thus, to calculate an average 3D hardware unit utilization for a longer
    period of time it is recommended to use a larger number of per draw call
    queries rather than a lower number of per 3D scene queries. It is
    recommended to use this method when the application uses full screen
    mode, as the current implementation of queries supports only global
    context.

    4. How can results of the query be read?

    Results of the queries cannot be read before the entire drawing is done
    by the GPU. This means that the application programmer has to decide
    which synchronization method to use to read the query results. There are
    the following options:

    - Use glFlush to trigger submission of any pending commands to the
      GPU. Later check results availability with repetitive non-blocking
      calls to the GetPerfQueryDataINTEL function using the synchronization
      flag of GL_PERFQUERY_DONOT_FLUSH_INTEL.

    - Use the flag GL_PERFQUERY_FLUSH_INTEL in glGetPerfQueryDataINTEL to
      trigger submission of any pending commands to the GPU. If results are
      not immediately available, check their availability with repetitive
      non-blocking calls to the GetPerfQueryDataINTEL function using the
      synchronization flag of GL_PERFQUERY_DONOT_FLUSH_INTEL.

    - Do a blocking call to glGetPerfQueryDataINTEL() with the
      GL_PERFQUERY_WAIT_INTEL flag set. The flag ensures that any pending GPU
      commands are submitted and the function blocks until GPU results are
      available.

    It is allowed to perform simultaneous measurements with multiple active
    queries of the same type. However, it may not be allowed to perform
    simultaneous measurements of queries with different types, as it may
    require reprogramming of the same hardware part and could destroy the
    hardware settings of the previous query.

    5. Are query results always accurate?

    There are certain hardware conditions which may cause the results
    of performance counters expressed in hardware clocks to be inaccurate.
    The conditions may include:

    - Render clock change - this condition usually causes all counter
      values expressed in hardware clocks to be incorrect. It is indicated by
      the FrequencyChanged flag.

    - Render commands split - in some cases the GPU has to split execution
      of the drawing operations surrounded by the query into at least two
      parts. This condition usually causes counter values expressed in the
      time domain (in microseconds) to be substantially larger than the
      average values of that counter. It is indicated by the SplitOccured
      flag.

    - Rendering preemption - if the GPU is shared among two or more 3D
      applications, the hardware counters gathered in global mode contain
      additive results for these applications. This condition is also
      indicated by the SplitOccured flag.

    The above conditions are indicated in special fields in the query
    results structures. It is up to the user to decide whether the results
    are to be processed further or dropped. In certain cases it can be
    determined that the render commands split condition always occurs and
    has to be accepted.

    6. Are query results per-context or global?

    Some GPU platforms and/or driver versions support only global GPU
    counters. In such cases, the query instance has to have the
    GL_PERFQUERY_GLOBAL_CONTEXT_INTEL flag set when it is created.
    Otherwise, creation will fail and an INVALID_OPERATION error will be
    generated.
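
    As an illustration only, the sketch below shows one way an application
    might test the capsMask value returned by glGetPerfQueryInfoINTEL to
    find out whether a query type is global. The helper name
    perfQueryIsGlobal is hypothetical and not part of this extension. The
    two enum values are those published in glext.h and are repeated only so
    that the fragment compiles on its own. Treating capsMask as a bitmask
    in which the global flag occupies the lowest bit is an assumption of
    this sketch.

```c
#include <stdbool.h>

/* Values as published in glext.h for GL_INTEL_performance_query;
   repeated here only so the sketch compiles without GL headers. */
#ifndef GL_PERFQUERY_SINGLE_CONTEXT_INTEL
#define GL_PERFQUERY_SINGLE_CONTEXT_INTEL 0x00000000
#endif
#ifndef GL_PERFQUERY_GLOBAL_CONTEXT_INTEL
#define GL_PERFQUERY_GLOBAL_CONTEXT_INTEL 0x00000001
#endif

/* Hypothetical helper, not defined by the extension.
   Returns true when the capsMask reported for a query type has the
   global-context bit set, i.e. when results are assumed to cover the
   activity of the whole GPU rather than only the calling context. */
static bool perfQueryIsGlobal(unsigned int capsMask)
{
    return (capsMask & GL_PERFQUERY_GLOBAL_CONTEXT_INTEL) != 0;
}
```

    An application could call such a helper on the capsMask of each query
    type and treat results from global queries as including work submitted
    by other clients of the GPU.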

    Support for a global context means that a single query instance measures
    all GPU activities performed between query start and query end. The
    query measures not only the current OpenGL context but also the
    activities of other OpenGL contexts, other 3D APIs such as DX, and
    operating system window draw calls.

Program examples

    1. Reading counter meta data example

    // query data has proprietary predefined structure layout
    // associated with the vendor query ID
    GL_QUERY_PIPELINE_METRICS * pQueryData;

    uint queryId;
    uint nextQueryId;
    uint queryHandle;
    uint dataSize;
    uint noCounters;
    uint noInstances;
    uint capsMask;

    const uint queryNameLen = 32;
    char queryName[queryNameLen];

    const uint counterNameLen = 32;
    char counterName[counterNameLen];

    const uint counterDescLen = 256;
    char counterDesc[counterDescLen];

    // get first vendor queryID
    glGetFirstPerfQueryIdINTEL(&queryId);

    nextQueryId = queryId;
    while(nextQueryId)
    {
        glGetPerfQueryInfoINTEL(
            nextQueryId,
            queryNameLen,
            queryName,
            &dataSize,
            &noCounters,
            &noInstances,
            &capsMask);

        for(uint counterId = 1; counterId <= noCounters; counterId++)
        {
            uint counterOffset;
            uint counterDataSize;
            uint counterTypeEnum;
            uint counterDataTypeEnum;
            UINT64 rawCounterMaxValue;

            glGetPerfCounterInfoINTEL(
                nextQueryId,
                counterId,
                counterNameLen,
                counterName,
                counterDescLen,
                counterDesc,
                &counterOffset,
                &counterDataSize,
                &counterTypeEnum,
                &counterDataTypeEnum,
                &rawCounterMaxValue);

            // use returned values here
            ...
        }

        // advance to the next query type so that the loop terminates
        glGetNextPerfQueryIdINTEL(nextQueryId, &nextQueryId);
    }

    2. Measuring a single draw call example

    Note that GL_QUERY_PIPELINE_METRICS is a proprietary structure defined
    by the vendor and is used as an example, and that functions named
    according to the glFunctionINTEL convention are wrappers to dynamically
    linked-by-name procedures.

    // query data has proprietary predefined structure layout
    // associated with the vendor query ID
    GL_QUERY_PIPELINE_METRICS * pQueryData;

    uint queryId;
    uint queryHandle;
    char queryName[] = "Intel_Pipeline_Query";

    // get vendor queryID by name
    glGetPerfQueryIdByNameINTEL(queryName, &queryId);

    // create query instance of queryId type
    glCreatePerfQueryINTEL(queryId, &queryHandle);

    glBeginPerfQueryINTEL(queryHandle);  // Start query

    glDrawElements(...);                 // Issue graphics commands, do whatever

    glEndPerfQueryINTEL(queryHandle);    // End query

    // perform other application activities

    uint bytesWritten = 0;
    uint dataSize = sizeof(GL_QUERY_PIPELINE_METRICS);

    pQueryData = (GL_QUERY_PIPELINE_METRICS *) malloc(dataSize);

    // for the first time use GL_PERFQUERY_FLUSH_INTEL flag to ensure graphics
    // commands were submitted to hardware

    glGetPerfQueryDataINTEL(
        queryHandle,
        GL_PERFQUERY_FLUSH_INTEL,
        dataSize,
        pQueryData,
        &bytesWritten);

    while(bytesWritten == 0)
    {
        // From now on the GL_PERFQUERY_DONOT_FLUSH_INTEL flag is sufficient
        glGetPerfQueryDataINTEL(
            queryHandle,
            GL_PERFQUERY_DONOT_FLUSH_INTEL,
            dataSize,
            pQueryData,
            &bytesWritten);
    }

    if(bytesWritten == dataSize)
    {
        // Use counters' data here
        uint64 vertexShaderKernelsRunCount =
            pQueryData->VertexShaderInvocations;
        uint64 fragmentShaderKernelsRunCount =
            pQueryData->FragmentShaderInvocations;
        ...
    }
    else
    {
        // error handling case
    }

    free(pQueryData);                    // result buffer is released
    glDeletePerfQueryINTEL(queryHandle); // query instance is released

    3. Measuring multiple draw calls with synchronous wait for result

    Note that GL_QUERY_HD_HW_METRICS is a proprietary structure defined by
    the vendor and is used as an example, and that functions named
    according to the glFunctionINTEL convention are wrappers to dynamically
    linked-by-name procedures.

    // query data has proprietary predefined structure layout
    // associated with the vendor query ID
    GL_QUERY_HD_HW_METRICS * pQueryData;

    uint queryId;
    UINT32 queryHandle[1000];
    char queryName[] = "Intel_HD_Hardware_Counters";

    // get vendor queryID by name
    glGetPerfQueryIdByNameINTEL(queryName, &queryId);

    // create memory for 1000 results
    uint dataSize = sizeof(GL_QUERY_HD_HW_METRICS);
    pQueryData = (GL_QUERY_HD_HW_METRICS *) malloc(dataSize * 1000);

    // create 1000 query instances of queryId type
    for(int i = 0; i < 1000; i++)
    {
        glCreatePerfQueryINTEL(queryId, &queryHandle[i]);
    }

    uint currentDrawNumber = 0;

    // start 1st query
    glBeginPerfQueryINTEL(queryHandle[currentDrawNumber]);

    glDrawElements(...); // Issue graphics commands

    // end query
    glEndPerfQueryINTEL(queryHandle[currentDrawNumber++]);

    ...

    // start nth query
    glBeginPerfQueryINTEL(queryHandle[currentDrawNumber]);

    glDrawElements(...); // Issue graphics commands

    // end query
    glEndPerfQueryINTEL(queryHandle[currentDrawNumber++]);

    ...

    // assume currentDrawNumber == 1000 here
    // so get all results after these 1000 draws

    GL_QUERY_HD_HW_METRICS *pData = pQueryData;

    for(int i = 0; i < 1000; i++)
    {
        uint bytesWritten = 0;

        // use the GL_PERFQUERY_WAIT_INTEL flag so that the function waits
        // for the query completion
        glGetPerfQueryDataINTEL(
            queryHandle[i],
            GL_PERFQUERY_WAIT_INTEL,
            dataSize,
            pData,
            &bytesWritten);

        if(bytesWritten != sizeof(GL_QUERY_HD_HW_METRICS))
        {
            // query error case
            assert(false);
            ...
            // some cleanup needed also
            ...
            return ERROR;
        }

        pData++;
    }

    // use counters data
    ...

    // repeat measurements if needed reusing the query instances
    ...

    // query instances are no longer needed so release all of them
    for(int i = 0; i < 1000; i++)
    {
        glDeletePerfQueryINTEL(queryHandle[i]);
    }

    return SUCCESS;

Revision History

    1.3   20/12/13  Jon Leech   Assign extension #s and enum values. Fix
                                a few typos (Bug 11345).

    1.2   29/11/13  sgrajewski  Extension upgraded to 4.4 core specification.
                                ES3.0.2 dependencies added.

    1.1   06/06/11  puminski    Initial revision.