DirectX 11 zip file download

Then we can iteratively apply the SphereInsidePlane function to determine if the sphere is contained inside the culling frustum. Since the sphere is described in view space, we can quickly determine if the light should be culled based on its z-position and the distance to the near and far clipping planes.

If the sphere is either fully in front of the near clipping plane, or fully behind the far clipping plane, then the light can be discarded. Otherwise we have to check if the light is within the bounds of the culling frustum.

The SphereInsideFrustum function assumes a right-handed coordinate system with the camera looking towards the negative z axis. In this case, the far plane lies towards negative infinity, so to check if the sphere is farther away we have to test whether its z-value is less than that of the far plane (farther in the negative direction).
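The sphere test described above can be sketched on the CPU in C++. This is only an illustration, not the HLSL from the demo: the type names, the `sphereInsidePlane` and `sphereInsideFrustum` signatures, and the convention that the four side planes have normals pointing into the frustum are all assumptions made for this example.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Plane in constant-normal form: dot(n, X) - d = 0, with n of unit length.
// Assumption: plane normals point into the frustum, so the negative
// half-space of a plane is the region outside the frustum.
struct Plane   { Vec3 n; float d; };
struct Sphere  { Vec3 c; float r; };
struct Frustum { std::array<Plane, 4> planes; };  // left, right, top, bottom

// True if the sphere lies entirely in the negative half-space of the plane.
inline bool sphereInsidePlane(const Sphere& s, const Plane& p) {
    return dot(p.n, s.c) - p.d < -s.r;
}

// zNear and zFar are view-space z values. In a right-handed system looking
// down -z both are negative and zFar < zNear.
inline bool sphereInsideFrustum(const Sphere& s, const Frustum& f,
                                float zNear, float zFar) {
    assert(zFar < zNear);
    // Entirely in front of the near plane or entirely beyond the far plane.
    if (s.c.z - s.r > zNear || s.c.z + s.r < zFar) return false;
    for (const Plane& p : f.planes) {
        if (sphereInsidePlane(s, p)) return false;  // fully outside this plane
    }
    return true;
}
```

A sphere that merely touches or straddles a plane is kept, so the test is conservative: it may keep a light that does not actually affect the tile, but it should never discard one that does.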

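For a left-handed coordinate system, the zNear and zFar variables should be swapped.

To test if a cone is completely contained in the negative half-space of a plane, only two points need to be tested: the tip of the cone, and the point on the rim of the cone's base that lies farthest in the direction of the plane normal. If both of these points are contained in the negative half-space of any of the frustum planes, then the cone can be culled.

If the axis of the cone is parallel to the plane normal, every point on the rim of the base is equally far from the plane. This special case does not need to be handled specifically because in this case the test reduces to checking the center of the base, which still gives the correct result. With both points computed, we test whether they are both in the negative half-space of the plane. If they are, we can conclude that the light can be culled.

To test if a point X is in the negative half-space of a plane with unit normal n and distance d from the origin, we can use the following equation: n · X − d < 0. The ConeInsidePlane function is used to test if a cone is fully contained in the negative half-space of a plane. The ConeInsideFrustum function is used to test if the cone is contained within the clipping frustum.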
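As a rough illustration of the two-point cone test, here is a CPU-side C++ sketch. The struct layouts and helper names are assumptions for this example, and the intermediate vector is normalized explicitly here, which may differ from how the demo's shader handles it.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 add(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 scale(const Vec3& a, float s)     { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Plane in constant-normal form: dot(n, X) - d = 0, with n of unit length.
struct Plane { Vec3 n; float d; };

// Hypothetical cone layout: tip position, unit axis direction, height, base radius.
struct Cone { Vec3 tip; Vec3 dir; float height; float radius; };

inline bool pointInsidePlane(const Vec3& p, const Plane& plane) {
    return dot(plane.n, p) - plane.d < 0.0f;
}

// True if the whole cone lies in the negative half-space of the plane.
inline bool coneInsidePlane(const Cone& cone, const Plane& plane) {
    // m = (n x d) x d is perpendicular to the axis d and points away from n.
    const Vec3 m = cross(cross(plane.n, cone.dir), cone.dir);
    const float len = std::sqrt(dot(m, m));

    // Start at the center of the base, then step to the rim point that lies
    // farthest along n. If the axis is parallel to n, m is zero and the
    // center of the base is tested instead.
    Vec3 q = add(cone.tip, scale(cone.dir, cone.height));
    if (len > 1e-6f) {
        q = sub(q, scale(m, cone.radius / len));
    }
    return pointInsidePlane(cone.tip, plane) && pointInsidePlane(q, plane);
}
```

Because a cone is the convex hull of its tip and its base disc, checking the tip and the single rim point that reaches farthest along the normal is enough to decide whether every point of the cone is behind the plane.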

The ConeInsideFrustum function returns true if the cone is inside the frustum, or false if it is fully contained in the negative half-space of any of the clipping planes. First we check if the cone is clipped by the near or far clipping planes. Otherwise we have to check the four planes of the culling frustum. If the cone is in the negative half-space of any of the clipping planes, the function will return false.

The purpose of the light culling compute shader is to update the global light index list and the light grid that is required by the fragment shader.

Two lists need to be updated per frame: one for opaque geometry and one for transparent geometry. Both lists will be updated in the light culling compute shader. In order to read the depth values that are generated by the depth pre-pass, the resulting depth texture will need to be sent to the light culling compute shader. The DepthTextureVS texture contains the result of the depth pre-pass.

Although the light index counters are of type RWStructuredBuffer, these buffers only contain a single unsigned integer at index 0.

To store the minimum and maximum depth values per tile, we need to declare some group-shared variables. Atomic operations, however, can only be performed on integer values. To circumvent this limitation, the depth values will be stored as unsigned integers in group-shared memory, which will be atomically compared and updated per thread.

Since the frustum used to perform culling will be the same frustum for all threads in a group, it makes sense to keep only one copy of the frustum for all threads in a group.

Only thread 0 in the group will need to copy the frustum from the global memory buffer and we also reduce the amount of local register memory required per thread. We also need to declare group-shared variables to create the temporary light lists.

We will need a separate list for opaque and transparent geometry. The LightCount will keep track of the number of lights that are intersecting the current tile frustum. The LightIndexStartOffset is the offset into the global light index list.

This index will be written to the light grid and is used as the starting offset when copying the local light index list to global light index list.

The local light index list has a fixed capacity, which limits how many lights can be stored for a single tile. Keep in mind that when we allocated storage for the global light list, we accounted for a smaller average number of lights per tile.

It is possible that some tiles contain more lights than this average (as long as they do not exceed the capacity of the local list) and that other tiles contain fewer, but we expect the average to hold across the screen. As previously mentioned, the estimated average is probably an overestimation, but since GPU memory is not a limiting constraint for this project, I can afford to be liberal with my estimations.

To update the local light counter and the light list, I will define a helper function called AppendLight.

Unfortunately I have not yet figured out how to pass group-shared variables as arguments to a function so for now I will define two versions of the same function. One version of the function is used to update the light index list for opaque geometry and the other version is for transparent geometry.
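A CPU-side analogue of such a helper can be sketched in C++, where `std::atomic::fetch_add` plays the role of the shader's InterlockedAdd. The `TileLightList` type, the `appendLight` and `storedCount` names, and the capacity constant are hypothetical and only serve this illustration. In C++ the list can simply be passed by reference, so one function covers both the opaque and the transparent list.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical per-tile capacity, chosen for illustration only.
constexpr std::uint32_t kMaxLightsPerTile = 1024;

struct TileLightList {
    std::atomic<std::uint32_t> count{0};
    std::array<std::uint32_t, kMaxLightsPerTile> indices{};
};

// Atomically reserves the next slot and stores the light index there.
// Returns false (and drops the light) once the list is full.
inline bool appendLight(TileLightList& list, std::uint32_t lightIndex) {
    // fetch_add returns the value held *before* the increment, so every
    // caller receives a unique slot even when called concurrently.
    const std::uint32_t slot = list.count.fetch_add(1);
    if (slot >= kMaxLightsPerTile) return false;
    list.indices[slot] = lightIndex;
    return true;
}

// The counter keeps growing past the capacity, so clamp it when reading.
inline std::uint32_t storedCount(const TileLightList& list) {
    return std::min(list.count.load(), kMaxLightsPerTile);
}
```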

The InterlockedAdd function guarantees that the group-shared light count variable is only updated by a single thread at a time. This way we avoid any race conditions that may occur when multiple threads try to increment the group-shared light count at the same time. The value of the light count before it is incremented is stored in the index local variable and used to update the light index in the group-shared light index list.

The first thing we will do in the light culling compute shader is read the depth value for the current thread.

Each thread in the thread group will sample the depth buffer only once, and thus all threads in a group together will sample all depth values for a single tile. Since we can only perform atomic operations on integers, we reinterpret the bits of the floating-point depth as an unsigned integer. Since we expect all depth values in the depth map to be stored in the range [0…1] (that is, all depth values are positive), reinterpreting the float as an unsigned integer still allows us to correctly perform comparisons on these values.
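The reason this works is that for non-negative IEEE 754 floats, the bit patterns sort in the same order as the values they encode. A small C++ sketch can make this concrete. The function names are made up for this example, and the loop is a sequential stand-in for what the shader does with atomic min and max operations across threads.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");

// Reinterpret the bits of a float as an unsigned integer (no conversion).
inline std::uint32_t floatBitsToUint(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Reinterpret the bits of an unsigned integer as a float.
inline float uintBitsToFloat(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

struct DepthRange { float minDepth; float maxDepth; };

// Finds the minimum and maximum depth of a tile by comparing bit patterns
// only. All depths must be non-negative for the ordering to hold.
inline DepthRange tileDepthRange(const float* depths, std::size_t count) {
    assert(count > 0);
    std::uint32_t uMin = 0xffffffffu;  // larger than any valid depth pattern
    std::uint32_t uMax = 0u;
    for (std::size_t i = 0; i < count; ++i) {
        assert(!std::signbit(depths[i]));
        const std::uint32_t u = floatBitsToUint(depths[i]);
        if (u < uMin) uMin = u;
        if (u > uMax) uMax = u;
    }
    return DepthRange{uintBitsToFloat(uMin), uintBitsToFloat(uMax)};
}
```

Negative values would break the ordering because the sign bit is the most significant bit of the pattern, which is why the restriction to the [0…1] range matters.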

Since we are setting group-shared variables, only one thread in the group needs to set them. To make sure that every thread in the group has reached the same point in the compute shader, we invoke the GroupMemoryBarrierWithGroupSync function. This ensures that any writes to group-shared memory have completed and that the execution of all threads in the group has reached this point.

The InterlockedMin and InterlockedMax methods are used to atomically update the uMinDepth and uMaxDepth group-shared variables based on the current thread's depth value. We again need to use the GroupMemoryBarrierWithGroupSync function to ensure all writes to group-shared memory have been committed and all threads in the group have reached this point in the compute shader.

After the minimum and maximum depth values for the current tile have been found, we can reinterpret the unsigned integers back to floats so that we can use them to compute the view space clipping planes for the current tile. The view space depth values are computed using the ScreenToView function and extracting the z component of the position in view space.

We only need these values to compute the near and far clipping planes in view space so we only need to know the distance from the viewer.

We can verify that this is correct by using the constant-normal form of a plane: n · X − d = 0.

For 10,000 lights, the for loop only needs 40 iterations per thread to check all lights for a tile. First we check if the light is within the tile frustum using the near clipping plane of the camera and the maximum depth read from the depth buffer. If the light volume is in this range, it is added to the light index list for transparent geometry.

To check if the light should be added to the global light index list for opaque geometry, we only need to check the minimum depth clipping plane that was previously defined. If the light is within the culling frustum for transparent geometry and in front of the minimum depth clipping plane, the index of the light is added to the light index list for opaque geometry.
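The iteration count mentioned above follows from the way the light list is divided among the threads of a group: each thread starts at its own index and strides by the group size. The sketch below assumes a group of 16 × 16 = 256 threads, which is not stated in this excerpt, and uses made-up function names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Light indices visited by one thread of the group:
// threadIndex, threadIndex + groupSize, threadIndex + 2 * groupSize, ...
inline std::vector<std::uint32_t> lightsForThread(std::uint32_t threadIndex,
                                                  std::uint32_t groupSize,
                                                  std::uint32_t numLights) {
    assert(groupSize > 0 && threadIndex < groupSize);
    std::vector<std::uint32_t> visited;
    for (std::uint32_t i = threadIndex; i < numLights; i += groupSize) {
        visited.push_back(i);
    }
    return visited;
}

// Upper bound on loop iterations for any thread: ceil(numLights / groupSize).
inline std::uint32_t maxIterationsPerThread(std::uint32_t groupSize,
                                            std::uint32_t numLights) {
    assert(groupSize > 0);
    return (numLights + groupSize - 1) / groupSize;
}
```

With 256 threads and 10,000 lights, `maxIterationsPerThread(256, 10000)` evaluates to 40, and together the threads of the group visit every light exactly once.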

The radius of the base of the spotlight cone is not stored with the light so it needs to be calculated for the ConeInsideFrustum function. To compute the radius of the base of the cone, we can use the tangent of the spotlight angle multiplied by the height of the cone.

And finally we need to check directional lights. This is by far the easiest part of this function.

To ensure that all threads in the thread group have recorded their lights to the group-shared light index list, we will invoke the GroupMemoryBarrierWithGroupSync function to synchronize all threads in the group.

After we have added all non-culled lights to the group-shared light index lists, we need to copy them to the global light index list. We will once again use the InterlockedAdd function to increment the global light index list counter by the number of lights that were appended to the group-shared light index list. The light grid is then updated with the offset and light count of the global light index list.

To avoid race conditions, only the first thread in the thread group will be used to update the global memory. All threads in the thread group must then be synced again before we can update the global light index list.

To update the opaque and transparent global light index lists, we will allow all threads to write a single index at a time into the light index list, using a method similar to the one that was used to iterate the light list previously. At this point both the light grid and the global light index list contain the necessary data to be used by the pixel shader to perform final shading.
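The two steps above (one thread reserving a range in the global list, then all threads copying entries into it) can be sketched in C++ as follows. The `GridCell` layout, the function names, and the flat `std::vector` buffers are assumptions for this illustration. `std::atomic::fetch_add` stands in for InterlockedAdd.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// One light grid entry: where a tile's lights start in the global light
// index list, and how many there are.
struct GridCell {
    std::uint32_t startOffset;
    std::uint32_t lightCount;
};

// Executed by a single thread per group: atomically reserves a contiguous
// range of localCount entries in the global light index list.
inline GridCell reserveRange(std::atomic<std::uint32_t>& globalCounter,
                             std::uint32_t localCount) {
    const std::uint32_t start = globalCounter.fetch_add(localCount);
    return GridCell{start, localCount};
}

// Executed by every thread of the group after a barrier: thread t copies
// entries t, t + groupSize, ... from the local list into the reserved range.
inline void copyToGlobal(const std::vector<std::uint32_t>& localList,
                         const GridCell& cell,
                         std::uint32_t threadIndex,
                         std::uint32_t groupSize,
                         std::vector<std::uint32_t>& globalList) {
    for (std::uint32_t i = threadIndex; i < cell.lightCount; i += groupSize) {
        assert(cell.startOffset + i < globalList.size());
        globalList[cell.startOffset + i] = localList[i];
    }
}
```

Because each group reserves its own disjoint range, the copy itself needs no further synchronization between groups. Only the counter update has to be atomic.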

This method is no different from the standard forward rendering technique that was discussed in the section titled Forward Rendering — Pixel Shader except that instead of looping through the entire global light list, we use the light index list that was generated in the light culling phase.

When rendering opaque geometry, you must take care to bind the light index list and light grid for opaque geometry and when rendering transparent geometry, the light index list and light grid for transparent geometry.

Of course this seems obvious but the only differentiating factor for the final shading pixel shader is the light index list and light grid that is bound to the pixel shader stage. Most of the code for this pixel shader is identical to that of the forward rendering pixel shader so it is omitted here for brevity.
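The part that differs from standard forward rendering, looking up the lights for a pixel, might look roughly like this on the CPU. This is a sketch under assumptions: the light grid is modelled here as a flat row-major array of (start offset, light count) pairs, and all names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One light grid entry: start offset into the light index list and the
// number of lights that affect the tile.
struct GridCell {
    std::uint32_t startOffset;
    std::uint32_t lightCount;
};

// Flat index of the tile that contains the given pixel (row-major grid).
inline std::uint32_t tileIndex(std::uint32_t pixelX, std::uint32_t pixelY,
                               std::uint32_t tileSize, std::uint32_t tilesPerRow) {
    return (pixelX / tileSize) + (pixelY / tileSize) * tilesPerRow;
}

// Returns the indices of the lights that should be evaluated for a pixel,
// instead of looping over the entire global light list.
inline std::vector<std::uint32_t> lightsForPixel(
        std::uint32_t pixelX, std::uint32_t pixelY,
        std::uint32_t tileSize, std::uint32_t tilesPerRow,
        const std::vector<GridCell>& lightGrid,
        const std::vector<std::uint32_t>& lightIndexList) {
    const std::uint32_t tile = tileIndex(pixelX, pixelY, tileSize, tilesPerRow);
    assert(tile < lightGrid.size());
    const GridCell& cell = lightGrid[tile];
    assert(cell.startOffset + cell.lightCount <= lightIndexList.size());
    const auto first = lightIndexList.begin() + cell.startOffset;
    return std::vector<std::uint32_t>(first, first + cell.lightCount);
}
```

The shading loop then iterates over the returned indices and fetches each light from the global light list, so the cost per pixel depends on the lights in its tile rather than on the total number of lights in the scene.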

The primary concept here is that the tile index into the light grid is computed from the screen space position. Using the tile index, the start offset and light count are read from the light grid.

The camera was placed close to the world origin and the lights were animated to rotate in a circle around the world origin. Having a few large lights in the scene is a realistic scenario (for example key light, fill light, and back light [25]).

These lights may be shadow casters that set the mood and create the ambience for the scene. Having many more than 5 large lights that fill the screen is not necessarily a realistic scenario but I wanted to see how the various techniques scaled when using large, screen-filling lights.

Having many small lights is a more realistic scenario that might be commonly used in games. Many small lights can be used to simulate area lights or bounced lighting effects similar to the effects of global illumination algorithms that are usually only simulated using light maps or light probes as described in the section titled Forward Rendering.

Although the demo supports directional lights, I did not test the performance of rendering using directional lights. Directional lights are large screen-filling lights that are similar to the large lights used in the first scenario.

In both scenarios lights were randomly placed throughout the scene within the boundaries of the scene. The Sponza scene was scaled down so that its bounds were approximately 30 units in the X and Z axes and 15 units in the Y axis.

Each graph displays a set of curves that represent the various phases of the rendering technique. The horizontal axis of the graph represents the number of lights in the scene and the vertical axis represents the running time measured in milliseconds.

Each graph also displays a minimum and maximum threshold. The minimum threshold is displayed as a green horizontal line in the graph and represents the ideal frame-rate of 60 frames per second (FPS), or about 16.7 ms per frame. The maximum threshold is displayed as a red horizontal line in the graph and represents the lowest acceptable frame-rate of 30 FPS, or about 33.3 ms per frame.

The graph below shows the performance results of the forward rendering technique using large lights.

The graph displays the two primary phases of the forward rendering technique. The purple curve shows the opaque pass and the dark red curve shows the transparent pass. The orange line shows the total time to render the scene. As can be seen in this graph, rendering opaque geometry takes the most time and increases exponentially as the number of lights increases.

The time to render transparent geometry also increases exponentially but there is much less transparent geometry in the scene than opaque geometry so the increase seems more gradual. Even with very large lights, standard forward rendering is able to render 64 dynamic lights while still keeping the frame time below the maximum threshold (30 FPS).

Beyond a certain number of lights, the frame time becomes too high to measure. From this we can conclude that if the scene contains more than 64 large visible lights, you may want to consider using a different rendering technique than forward rendering.

Forward rendering performs better when the scene contains many small lights. In this case, the rendering technique can handle twice as many lights while still maintaining acceptable performance.

Past a certain number of lights, the frame time was so high that it was no longer worth measuring. We see again that most of the time is spent rendering opaque geometry, which is not surprising. The trends for both large and small lights are similar but when using small lights, we can create twice as many lights while achieving acceptable frame-rates.

The same experiment was repeated but this time using the deferred rendering technique. Rendering large lights using deferred rendering proved to be only marginally better than forward rendering.

Since rendering transparent geometry uses the exact same code paths as the forward rendering technique, the performance of rendering transparent geometry using forward versus deferred rendering are virtually identical. As expected, there is no performance benefit when rendering transparent geometry.

The marginal performance benefit of rendering opaque geometry using deferred rendering is primarily due to the reduced number of redundant lighting computations that forward rendering performs on occluded geometry. Redundant lighting computations that are performed when using forward rendering can be mitigated by using a depth pre-pass which would allow for early z-testing to reject fragments before performing expensive lighting calculations. Deferred rendering implicitly benefits from early z-testing and stencil operations that are not performed during forward rendering.

The graph shows that deferred rendering is capable of rendering a large number of small dynamic lights while still maintaining acceptable frame rates. In this case the time to render transparent geometry greatly exceeds that of rendering opaque geometry. If rendering only opaque objects, the deferred rendering technique can handle even more lights while keeping the frame time below the minimum threshold (60 FPS).

Once the light count grows large enough, rendering transparent geometry greatly exceeds the maximum threshold.

The same experiment was repeated once again using tiled forward rendering. First we will look at the performance characteristics using large lights.

The graph below shows the performance results of tiled forward rendering using large scene lights. The graph shows that tiled forward rendering is not well suited for rendering scenes with many large lights. Rendering screen-filling lights in the scene caused issues because the demo only accounts for a fixed average number of lights per tile. With large lights this average was exceeded and many tiles simply appeared black. Using large lights, the light culling phase never exceeded 1 ms but the opaque pass and the transparent pass quickly exceeded the maximum threshold of 30 FPS.

Forward plus really shines when using many small lights. In this case we see that the light culling phase (orange line) is the primary bottleneck of the rendering technique. Even with over 16,000 lights, rendering opaque (blue line) and transparent (purple line) geometry falls below the minimum threshold needed to achieve the desired frame-rate of 60 FPS.

The majority of the frame time is consumed by the light culling phase.

As expected, forward rendering is the most expensive rendering algorithm when rendering large lights. Deferred rendering and tiled forward rendering are comparable in performance. Even if we disregard rendering transparent geometry in the scene, deferred rendering and tiled forward rendering have similar performance characteristics.

If we consider scenes with only a few large lights, there is still no discernible performance benefit between forward, deferred, or forward plus rendering.

If we consider the memory footprint required to perform forward rendering versus deferred rendering versus tiled forward rendering, then traditional forward rendering has the smallest memory usage.

Regardless of the number of lights in the scene, deferred rendering requires about four bytes of GPU memory per pixel per additional G-buffer render target. Tiled forward rendering requires additional GPU storage for the light index list and the light grid which must be stored even when the scene contains only a few dynamic lights.
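To get a feel for the sizes involved, the storage costs can be estimated with a few lines of C++. Everything here is a back-of-the-envelope sketch: the function names are invented, and the resolution, tile size, and average lights per tile used in the example further down are hypothetical values, as are the assumptions that each light index is a 32-bit integer and each light grid entry is a pair of 32-bit integers.

```cpp
#include <cassert>
#include <cstdint>

// Extra G-buffer storage for deferred rendering: one full-screen render
// target per additional buffer, at bytesPerPixel bytes each.
inline std::uint64_t gBufferBytes(std::uint32_t width, std::uint32_t height,
                                  std::uint32_t extraTargets,
                                  std::uint32_t bytesPerPixel = 4) {
    return static_cast<std::uint64_t>(width) * height * extraTargets * bytesPerPixel;
}

// Extra storage for tiled forward rendering, counting both the opaque and
// the transparent variants of the light index list and the light grid.
// Assumes 4 bytes per light index and 8 bytes per grid entry.
inline std::uint64_t tiledForwardBytes(std::uint32_t width, std::uint32_t height,
                                       std::uint32_t tileSize,
                                       std::uint32_t avgLightsPerTile) {
    assert(tileSize > 0);
    const std::uint64_t tilesX = (width + tileSize - 1) / tileSize;
    const std::uint64_t tilesY = (height + tileSize - 1) / tileSize;
    const std::uint64_t tiles = tilesX * tilesY;
    const std::uint64_t perList = tiles * (avgLightsPerTile * 4ull + 8ull);
    return 2ull * perList;
}
```

For example, at a hypothetical 1280 × 720 resolution, three additional 4-byte render targets come to `gBufferBytes(1280, 720, 3)` = 11,059,200 bytes, while 16 × 16 pixel tiles with an average of 200 lights per tile come to `tiledForwardBytes(1280, 720, 16, 200)` = 5,817,600 bytes.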

The additional storage requirement for deferred rendering is based on an additional three full-screen buffers at 32 bits (4 bytes) per pixel. If GPU storage is a rare commodity for the target platform and there is no need for many lights in the scene, traditional forward rendering is still the best choice.

In the case of small lights, tiled forward rendering clearly comes out as the winner in terms of rendering times. Up to a certain number of lights, deferred and tiled forward rendering are comparable in performance, but they quickly diverge when the scene contains many dynamic lights.

Also we must consider the fact that a large portion of the frame time of the deferred rendering technique is consumed by rendering transparent objects. If transparent objects are not a requirement, then deferred rendering may be a viable option. Even with small lights, deferred rendering requires many more draw calls to render the geometry of the light volumes. Using deferred rendering, each light volume must be rendered at least twice: the first draw call updates the stencil buffer and the second draw call performs the lighting equations.

If the graphics platform is very sensitive to excessive draw calls, then deferred rendering may not be the best choice. Similar to the scenario with large lights, when rendering only a few lights in the scene then all three techniques have similar performance characteristics. In this case, we must consider the additional memory requirements that are imposed by deferred and tiled forward rendering. Again, if GPU memory is scarce and there is no need for many dynamic lights in the scene then standard forward rendering may be a viable solution.

While working on this project I have identified several issues that would benefit from consideration in the future. For each of the rendering techniques used in this demo there is only a single global light list which stores directional, point, and spotlights in a single data structure. In order to store all of the properties necessary to perform correct lighting, each individual light structure takes up a substantial amount of GPU memory.

If we only store the absolute minimum amount of information needed to describe a light source we could take advantage of improved caching of the light data and potentially improve rendering performance across all rendering techniques. This may require having additional data structures to store only the relevant information that is needed by either the compute or the fragment shader or creating separate lists for directional, spot, and point lights so that no redundant information that is not relevant to the light source is stored in the data structure.

This implementation of the forward rendering technique makes no attempt to optimize the forward rendering pipeline. Culling lights against the view frustum would be a reasonable method to improve the rendering performance of the forward renderer.


The computer will restart during the restore process, and then Windows will load with a message confirming that the restore was successful. Check that DirectX was rolled back by running dxdiag. This will open the DXDiag tool, which will check your system and report the version of DirectX that is installed.

The DirectX version will be listed at the bottom of the System Information section on the first tab. You must have some version of DirectX installed with Windows. Windows 7 and later cannot have anything less than DirectX 11 installed.

Method 2. Run the DirectX Diagnostic Tool.

It will display an overview of your system. You can click each tab to see information on your display, sound, and inputs. A text box at the bottom of each tab will tell you if there are issues detected with that particular system.

Download the DirectX installer from Microsoft.

The best way to try to fix this is by reinstalling the latest version of DirectX. The installer is available for free from Microsoft. If you are having issues, upgrading to the latest version may help much more than uninstalling.

Run the installer. The installer will scan your system and then install the necessary files to update your copy of DirectX to the latest version. Many times, updating your video card drivers will help fix DirectX errors for games and other video-centric programs.


