SSE1 JavaScript Benchmark

System Info:
Native Clang Compiler:
clang version 11.0.0 (https://github.com/llvm/llvm-project.git cd12e79e6ddd235739716fff5c6a748915f664b9)
Target: x86_64-apple-darwin19.4.0
Thread model: posix
InstalledDir: /Users/clb/emsdk/llvm/git/build_master_64/bin
Emscripten Compiler:

Native _mm_load_ps: 2.5459ns -> 0.5112ns. Native SSE1 is 4.98x FASTER than native scalar.        
Native _mm_load_ps1: 2.5459ns -> 0.5079ns. Native SSE1 is 5.01x FASTER than native scalar.        
Native _mm_load_ss: 2.5459ns -> 0.4913ns. Native SSE1 is 5.18x FASTER than native scalar.        
Native _mm_load1_ps: 2.5459ns -> 0.5028ns. Native SSE1 is 5.06x FASTER than native scalar.        
Native _mm_loadh_pi: 2.5459ns -> 0.5043ns. Native SSE1 is 5.05x FASTER than native scalar.        
Native _mm_loadl_pi: 2.5459ns -> 0.5011ns. Native SSE1 is 5.08x FASTER than native scalar.        
Native _mm_loadr_ps: 2.5459ns -> 0.5045ns. Native SSE1 is 5.05x FASTER than native scalar.        
Native _mm_loadu_ps: 2.5459ns -> 0.4825ns. Native SSE1 is 5.28x FASTER than native scalar.        
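
Each timing line above gives the scalar time per operation, the SSE1 time, and the ratio between them. The sketch below shows how the printed ratio and the FASTER/SLOWER label appear to follow from the two timings. The function name `compare` is hypothetical and is not taken from the benchmark harness.

```python
def compare(baseline_ns: float, candidate_ns: float) -> str:
    """Describe a candidate timing relative to a baseline, e.g. '4.98x FASTER'."""
    if candidate_ns <= baseline_ns:
        # Candidate takes less time: report how many times faster it is.
        return f"{baseline_ns / candidate_ns:.2f}x FASTER"
    # Candidate takes more time: report how many times slower it is.
    return f"{candidate_ns / baseline_ns:.2f}x SLOWER"


# Timings copied from the "Native _mm_load_ps" line above.
print("Native SSE1 is", compare(2.5459, 0.5112), "than native scalar.")
```

The ratio is always reported as a value of at least 1.00, with the label carrying the direction.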

JS _mm_load_ps: 4.1821ns -> 0.9198ns. JS SSE1 is 4.55x FASTER than JS scalar.        
JS _mm_load_ps1: 4.1821ns -> 0.7721ns. JS SSE1 is 5.42x FASTER than JS scalar.        
JS _mm_load_ss: 4.1821ns -> 0.7578ns. JS SSE1 is 5.52x FASTER than JS scalar.        
JS _mm_load1_ps: 4.1821ns -> 0.5895ns. JS SSE1 is 7.09x FASTER than JS scalar.        
JS _mm_loadh_pi: 4.1821ns -> 0.8613ns. JS SSE1 is 4.86x FASTER than JS scalar.        
JS _mm_loadl_pi: 4.1821ns -> 1.0900ns. JS SSE1 is 3.84x FASTER than JS scalar.        
JS _mm_loadr_ps: 4.1821ns -> 0.7955ns. JS SSE1 is 5.26x FASTER than JS scalar.        
JS _mm_loadu_ps: 4.1821ns -> 0.7406ns. JS SSE1 is 5.65x FASTER than JS scalar.        

JS _mm_load_ps: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_load_ps1: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_load_ss: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_load1_ps: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_loadh_pi: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_loadl_pi: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_loadr_ps: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_loadu_ps: JS scalar is 1.64x SLOWER than native scalar.        

JS _mm_load_ps: JS SSE1 is 1.80x SLOWER than native SSE1.        
JS _mm_load_ps1: JS SSE1 is 1.52x SLOWER than native SSE1.        
JS _mm_load_ss: JS SSE1 is 1.54x SLOWER than native SSE1.        
JS _mm_load1_ps: JS SSE1 is 1.17x SLOWER than native SSE1.        
JS _mm_loadh_pi: JS SSE1 is 1.71x SLOWER than native SSE1.        
JS _mm_loadl_pi: JS SSE1 is 2.18x SLOWER than native SSE1.        
JS _mm_loadr_ps: JS SSE1 is 1.58x SLOWER than native SSE1.        
JS _mm_loadu_ps: JS SSE1 is 1.53x SLOWER than native SSE1.        
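
The two cross-platform blocks above can be reproduced from the four timings already printed for each intrinsic. The sketch below recomputes both rows for `_mm_load_ps`. The helper name `js_vs_native` is an assumption made for illustration, not the harness's own code.

```python
def js_vs_native(js_ns: float, native_ns: float) -> str:
    """Compare a JS timing against the matching native timing."""
    if js_ns >= native_ns:
        return f"{js_ns / native_ns:.2f}x SLOWER"
    return f"{native_ns / js_ns:.2f}x FASTER"


# Timings copied from the _mm_load_ps lines above.
native = {"scalar": 2.5459, "SSE1": 0.5112}
js = {"scalar": 4.1821, "SSE1": 0.9198}

for kind in ("scalar", "SSE1"):
    verdict = js_vs_native(js[kind], native[kind])
    print(f"JS _mm_load_ps: JS {kind} is {verdict} than native {kind}.")
```

Because every load intrinsic shares the same scalar baseline (2.5459ns native, 4.1821ns JS), the scalar-versus-scalar row is identical for all of them.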

Native _mm_set_ps: 2.5459ns -> 0.5385ns. Native SSE1 is 4.73x FASTER than native scalar.        
Native _mm_set_ps1: 2.5459ns -> 0.5126ns. Native SSE1 is 4.97x FASTER than native scalar.        
Native _mm_set_ss: 2.5459ns -> 0.4898ns. Native SSE1 is 5.20x FASTER than native scalar.        
Native _mm_set1_ps: 2.5459ns -> 0.5111ns. Native SSE1 is 4.98x FASTER than native scalar.        
Native _mm_setr_ps: 2.5459ns -> 0.5430ns. Native SSE1 is 4.69x FASTER than native scalar.        
Native _mm_setzero_ps: 2.5459ns -> 0.1724ns. Native SSE1 is 14.76x FASTER than native scalar.        

JS _mm_set_ps: 4.1821ns -> 0.6120ns. JS SSE1 is 6.83x FASTER than JS scalar.        
JS _mm_set_ps1: 4.1821ns -> 0.6768ns. JS SSE1 is 6.18x FASTER than JS scalar.        
JS _mm_set_ss: 4.1821ns -> 0.8103ns. JS SSE1 is 5.16x FASTER than JS scalar.        
JS _mm_set1_ps: 4.1821ns -> 0.9319ns. JS SSE1 is 4.49x FASTER than JS scalar.        
JS _mm_setr_ps: 4.1821ns -> 0.8277ns. JS SSE1 is 5.05x FASTER than JS scalar.        
JS _mm_setzero_ps: 4.1821ns -> 0.5332ns. JS SSE1 is 7.84x FASTER than JS scalar.        

JS _mm_set_ps: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_set_ps1: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_set_ss: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_set1_ps: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_setr_ps: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_setzero_ps: JS scalar is 1.64x SLOWER than native scalar.        

JS _mm_set_ps: JS SSE1 is 1.14x SLOWER than native SSE1.        
JS _mm_set_ps1: JS SSE1 is 1.32x SLOWER than native SSE1.        
JS _mm_set_ss: JS SSE1 is 1.65x SLOWER than native SSE1.        
JS _mm_set1_ps: JS SSE1 is 1.82x SLOWER than native SSE1.        
JS _mm_setr_ps: JS SSE1 is 1.52x SLOWER than native SSE1.        
JS _mm_setzero_ps: JS SSE1 is 3.09x SLOWER than native SSE1.        

Native _mm_shuffle_ps: 0.0000ns -> 0.0087ns. Native SSE1 is 8705.00x SLOWER than native scalar.        
Native _mm_unpackhi_ps: 0.2817ns -> 0.0087ns. Native SSE1 is 32.33x FASTER than native scalar.        
Native _mm_unpacklo_ps: 0.2808ns -> 0.0180ns. Native SSE1 is 15.63x FASTER than native scalar.        

JS _mm_shuffle_ps: 0.0001ns -> 0.2164ns. JS SSE1 is 3607.07x SLOWER than JS scalar.        
JS _mm_unpackhi_ps: 0.2871ns -> 0.0870ns. JS SSE1 is 3.30x FASTER than JS scalar.        
JS _mm_unpacklo_ps: 0.2816ns -> 0.0871ns. JS SSE1 is 3.23x FASTER than JS scalar.        

JS _mm_shuffle_ps: JS scalar is 60.00x SLOWER than native scalar.        
JS _mm_unpackhi_ps: JS scalar is 1.02x SLOWER than native scalar.        
JS _mm_unpacklo_ps: JS scalar is 1.00x SLOWER than native scalar.        

JS _mm_shuffle_ps: JS SSE1 is 24.86x SLOWER than native SSE1.        
JS _mm_unpackhi_ps: JS SSE1 is 9.99x SLOWER than native SSE1.        
JS _mm_unpacklo_ps: JS SSE1 is 4.85x SLOWER than native SSE1.        
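
The `_mm_shuffle_ps` rows print a scalar time of 0.0000ns or 0.0001ns alongside a finite ratio in the thousands. This suggests the ratios are computed from unrounded timings and that the scalar baseline is very close to zero. One possible cause is that the scalar loop was optimized away, but that is an assumption and the log does not confirm it. The sketch below backs out the implied baseline from the printed SSE1 time and ratio. The function name is hypothetical.

```python
def implied_baseline_ns(sse1_ns: float, slowdown: float) -> float:
    """Recover the unrounded scalar time from the SSE1 time and the printed slowdown."""
    return sse1_ns / slowdown


# Values copied from the _mm_shuffle_ps lines above.
native_scalar = implied_baseline_ns(0.0087, 8705.00)
js_scalar = implied_baseline_ns(0.2164, 3607.07)

print(f"implied native scalar: {native_scalar:.2e} ns")
print(f"implied JS scalar:     {js_scalar:.2e} ns")
print(f"JS / native scalar:    {js_scalar / native_scalar:.1f}x")
```

The recovered baselines are a tiny fraction of a nanosecond, so the 8705.00x and 3607.07x figures are better read as a sign of a degenerate baseline than as a real slowdown.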

Native _mm_max_ps: 2.0442ns -> 0.2855ns. Native SSE1 is 7.16x FASTER than native scalar.        
Native _mm_max_ss: 2.0442ns -> 0.2896ns. Native SSE1 is 7.06x FASTER than native scalar.        
Native _mm_min_ps: 2.3316ns -> 0.2819ns. Native SSE1 is 8.27x FASTER than native scalar.        
Native _mm_min_ss: 2.3316ns -> 0.2785ns. Native SSE1 is 8.37x FASTER than native scalar.        

JS _mm_max_ps: 2.0055ns -> 0.4991ns. JS SSE1 is 4.02x FASTER than JS scalar.        
JS _mm_max_ss: 2.0055ns -> 0.5715ns. JS SSE1 is 3.51x FASTER than JS scalar.        
JS _mm_min_ps: 2.0197ns -> 0.5060ns. JS SSE1 is 3.99x FASTER than JS scalar.        
JS _mm_min_ss: 2.0197ns -> 0.5690ns. JS SSE1 is 3.55x FASTER than JS scalar.        

JS _mm_max_ps: JS scalar is 1.02x FASTER than native scalar.        
JS _mm_max_ss: JS scalar is 1.02x FASTER than native scalar.        
JS _mm_min_ps: JS scalar is 1.15x FASTER than native scalar.        
JS _mm_min_ss: JS scalar is 1.15x FASTER than native scalar.        

JS _mm_max_ps: JS SSE1 is 1.75x SLOWER than native SSE1.        
JS _mm_max_ss: JS SSE1 is 1.97x SLOWER than native SSE1.        
JS _mm_min_ps: JS SSE1 is 1.79x SLOWER than native SSE1.        
JS _mm_min_ss: JS SSE1 is 2.04x SLOWER than native SSE1.        

Native _mm_move_ss: 2.5459ns -> 0.7399ns. Native SSE1 is 3.44x FASTER than native scalar.        
Native _mm_movehl_ps: 2.5459ns -> 0.7105ns. Native SSE1 is 3.58x FASTER than native scalar.        
Native _mm_movelh_ps: 2.5459ns -> 0.6946ns. Native SSE1 is 3.67x FASTER than native scalar.        

JS _mm_move_ss: 4.1821ns -> 1.0933ns. JS SSE1 is 3.83x FASTER than JS scalar.        
JS _mm_movehl_ps: 4.1821ns -> 0.8954ns. JS SSE1 is 4.67x FASTER than JS scalar.        
JS _mm_movelh_ps: 4.1821ns -> 0.8847ns. JS SSE1 is 4.73x FASTER than JS scalar.        

JS _mm_move_ss: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_movehl_ps: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_movelh_ps: JS scalar is 1.64x SLOWER than native scalar.        

JS _mm_move_ss: JS SSE1 is 1.48x SLOWER than native SSE1.        
JS _mm_movehl_ps: JS SSE1 is 1.26x SLOWER than native SSE1.        
JS _mm_movelh_ps: JS SSE1 is 1.27x SLOWER than native SSE1.        

Native _mm_store_ps: 2.5459ns -> 0.4673ns. Native SSE1 is 5.45x FASTER than native scalar.        
Native _mm_store_ps1: 2.5459ns -> 0.5081ns. Native SSE1 is 5.01x FASTER than native scalar.        
Native _mm_store_ss: 2.5459ns -> 0.4945ns. Native SSE1 is 5.15x FASTER than native scalar.        
Native _mm_storeh_pi: 2.5459ns -> 0.4994ns. Native SSE1 is 5.10x FASTER than native scalar.        
Native _mm_storel_pi: 2.5459ns -> 0.4906ns. Native SSE1 is 5.19x FASTER than native scalar.        
Native _mm_storer_ps: 2.5459ns -> 0.5058ns. Native SSE1 is 5.03x FASTER than native scalar.        
Native _mm_storeu_ps: 2.5459ns -> 0.5209ns. Native SSE1 is 4.89x FASTER than native scalar.        
Native _mm_stream_ps: 2.5459ns -> 0.3715ns. Native SSE1 is 6.85x FASTER than native scalar.        

JS _mm_store_ps: 4.1821ns -> 0.6469ns. JS SSE1 is 6.46x FASTER than JS scalar.        
JS _mm_store_ps1: 4.1821ns -> 1.0589ns. JS SSE1 is 3.95x FASTER than JS scalar.        
JS _mm_store_ss: 4.1821ns -> 0.5749ns. JS SSE1 is 7.27x FASTER than JS scalar.        
JS _mm_storeh_pi: 4.1821ns -> 0.6171ns. JS SSE1 is 6.78x FASTER than JS scalar.        
JS _mm_storel_pi: 4.1821ns -> 0.6765ns. JS SSE1 is 6.18x FASTER than JS scalar.        
JS _mm_storer_ps: 4.1821ns -> 0.5501ns. JS SSE1 is 7.60x FASTER than JS scalar.        
JS _mm_storeu_ps: 4.1821ns -> 0.6605ns. JS SSE1 is 6.33x FASTER than JS scalar.        
JS _mm_stream_ps: 4.1821ns -> 0.6438ns. JS SSE1 is 6.50x FASTER than JS scalar.        

JS _mm_store_ps: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_store_ps1: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_store_ss: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_storeh_pi: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_storel_pi: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_storer_ps: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_storeu_ps: JS scalar is 1.64x SLOWER than native scalar.        
JS _mm_stream_ps: JS scalar is 1.64x SLOWER than native scalar.        

JS _mm_store_ps: JS SSE1 is 1.38x SLOWER than native SSE1.        
JS _mm_store_ps1: JS SSE1 is 2.08x SLOWER than native SSE1.        
JS _mm_store_ss: JS SSE1 is 1.16x SLOWER than native SSE1.        
JS _mm_storeh_pi: JS SSE1 is 1.24x SLOWER than native SSE1.        
JS _mm_storel_pi: JS SSE1 is 1.38x SLOWER than native SSE1.        
JS _mm_storer_ps: JS SSE1 is 1.09x SLOWER than native SSE1.        
JS _mm_storeu_ps: JS SSE1 is 1.27x SLOWER than native SSE1.        
JS _mm_stream_ps: JS SSE1 is 1.73x SLOWER than native SSE1.        

Native _mm_and_ps: 0.0985ns -> 0.0087ns. Native SSE1 is 11.31x FASTER than native scalar.        
Native _mm_andnot_ps: 0.0348ns -> 0.0087ns. Native SSE1 is 4.00x FASTER than native scalar.        
Native _mm_or_ps: 0.1026ns -> 0.0096ns. Native SSE1 is 10.66x FASTER than native scalar.        
Native _mm_xor_ps: 0.0174ns -> 0.0087ns. Native SSE1 is 2.00x FASTER than native scalar.        

JS _mm_and_ps: 0.2970ns -> 0.1559ns. JS SSE1 is 1.90x FASTER than JS scalar.        
JS _mm_andnot_ps: 0.4129ns -> 0.1475ns. JS SSE1 is 2.80x FASTER than JS scalar.        
JS _mm_or_ps: 0.4178ns -> 0.1022ns. JS SSE1 is 4.09x FASTER than JS scalar.        
JS _mm_xor_ps: 0.3062ns -> 0.1529ns. JS SSE1 is 2.00x FASTER than JS scalar.        

JS _mm_and_ps: JS scalar is 3.02x SLOWER than native scalar.        
JS _mm_andnot_ps: JS scalar is 11.86x SLOWER than native scalar.        
JS _mm_or_ps: JS scalar is 4.07x SLOWER than native scalar.        
JS _mm_xor_ps: JS scalar is 17.59x SLOWER than native scalar.        

JS _mm_and_ps: JS SSE1 is 17.91x SLOWER than native SSE1.        
JS _mm_andnot_ps: JS SSE1 is 16.94x SLOWER than native SSE1.        
JS _mm_or_ps: JS SSE1 is 10.62x SLOWER than native SSE1.        
JS _mm_xor_ps: JS SSE1 is 17.57x SLOWER than native SSE1.        

Native _mm_add_ps: 1.6951ns -> 0.3804ns. Native SSE1 is 4.46x FASTER than native scalar.        
Native _mm_add_ss: 1.6951ns -> 0.4444ns. Native SSE1 is 3.81x FASTER than native scalar.        
Native _mm_div_ps: 49.7539ns -> 13.4710ns. Native SSE1 is 3.69x FASTER than native scalar.        
Native _mm_div_ss: 49.7539ns -> 10.4374ns. Native SSE1 is 4.77x FASTER than native scalar.        
Native _mm_mul_ps: 1.4049ns -> 0.2784ns. Native SSE1 is 5.05x FASTER than native scalar.        
Native _mm_mul_ss: 1.4049ns -> 0.3466ns. Native SSE1 is 4.05x FASTER than native scalar.        
Native _mm_sub_ps: 1.3992ns -> 0.3481ns. Native SSE1 is 4.02x FASTER than native scalar.        
Native _mm_sub_ss: 1.3992ns -> 0.3210ns. Native SSE1 is 4.36x FASTER than native scalar.        

JS _mm_add_ps: 1.3235ns -> 0.3267ns. JS SSE1 is 4.05x FASTER than JS scalar.        
JS _mm_add_ss: 1.3235ns -> 0.4294ns. JS SSE1 is 3.08x FASTER than JS scalar.        
JS _mm_div_ps: 58.2693ns -> 13.8939ns. JS SSE1 is 4.19x FASTER than JS scalar.        
JS _mm_div_ss: 58.2693ns -> 14.0787ns. JS SSE1 is 4.14x FASTER than JS scalar.        
JS _mm_mul_ps: 1.5426ns -> 0.3649ns. JS SSE1 is 4.23x FASTER than JS scalar.        
JS _mm_mul_ss: 1.5426ns -> 0.3760ns. JS SSE1 is 4.10x FASTER than JS scalar.        
JS _mm_sub_ps: 2.3714ns -> 0.4689ns. JS SSE1 is 5.06x FASTER than JS scalar.        
JS _mm_sub_ss: 2.3714ns -> 0.3777ns. JS SSE1 is 6.28x FASTER than JS scalar.        

JS _mm_add_ps: JS scalar is 1.28x FASTER than native scalar.        
JS _mm_add_ss: JS scalar is 1.28x FASTER than native scalar.        
JS _mm_div_ps: JS scalar is 1.17x SLOWER than native scalar.        
JS _mm_div_ss: JS scalar is 1.17x SLOWER than native scalar.        
JS _mm_mul_ps: JS scalar is 1.10x SLOWER than native scalar.        
JS _mm_mul_ss: JS scalar is 1.10x SLOWER than native scalar.        
JS _mm_sub_ps: JS scalar is 1.69x SLOWER than native scalar.        
JS _mm_sub_ss: JS scalar is 1.69x SLOWER than native scalar.        

JS _mm_add_ps: JS SSE1 is 1.16x FASTER than native SSE1.        
JS _mm_add_ss: JS SSE1 is 1.03x FASTER than native SSE1.        
JS _mm_div_ps: JS SSE1 is 1.03x SLOWER than native SSE1.        
JS _mm_div_ss: JS SSE1 is 1.35x SLOWER than native SSE1.        
JS _mm_mul_ps: JS SSE1 is 1.31x SLOWER than native SSE1.        
JS _mm_mul_ss: JS SSE1 is 1.08x SLOWER than native SSE1.        
JS _mm_sub_ps: JS SSE1 is 1.35x SLOWER than native SSE1.        
JS _mm_sub_ss: JS SSE1 is 1.18x SLOWER than native SSE1.        

Native _mm_rcp_ps: 3.3810ns -> 0.2796ns. Native SSE1 is 12.09x FASTER than native scalar.        
Native _mm_rcp_ss: 3.3810ns -> 0.2784ns. Native SSE1 is 12.14x FASTER than native scalar.        
Native _mm_rsqrt_ps: 6.7520ns -> 0.2808ns. Native SSE1 is 24.05x FASTER than native scalar.        
Native _mm_rsqrt_ss: 6.7520ns -> 0.2784ns. Native SSE1 is 24.25x FASTER than native scalar.        
Native _mm_sqrt_ps: 3.3709ns -> 0.8392ns. Native SSE1 is 4.02x FASTER than native scalar.        
Native _mm_sqrt_ss: 3.3709ns -> 0.8738ns. Native SSE1 is 3.86x FASTER than native scalar.        

JS _mm_rcp_ps: 7.8851ns -> 0.8824ns. JS SSE1 is 8.94x FASTER than JS scalar.        
JS _mm_rcp_ss: 7.8851ns -> 1.5280ns. JS SSE1 is 5.16x FASTER than JS scalar.        
JS _mm_rsqrt_ps: 12.0885ns -> 1.9992ns. JS SSE1 is 6.05x FASTER than JS scalar.        
JS _mm_rsqrt_ss: 12.0885ns -> 2.0127ns. JS SSE1 is 6.01x FASTER than JS scalar.        
JS _mm_sqrt_ps: 4.0107ns -> 0.9371ns. JS SSE1 is 4.28x FASTER than JS scalar.        
JS _mm_sqrt_ss: 4.0107ns -> 1.0343ns. JS SSE1 is 3.88x FASTER than JS scalar.        

JS _mm_rcp_ps: JS scalar is 2.33x SLOWER than native scalar.        
JS _mm_rcp_ss: JS scalar is 2.33x SLOWER than native scalar.        
JS _mm_rsqrt_ps: JS scalar is 1.79x SLOWER than native scalar.        
JS _mm_rsqrt_ss: JS scalar is 1.79x SLOWER than native scalar.        
JS _mm_sqrt_ps: JS scalar is 1.19x SLOWER than native scalar.        
JS _mm_sqrt_ss: JS scalar is 1.19x SLOWER than native scalar.        

JS _mm_rcp_ps: JS SSE1 is 3.16x SLOWER than native SSE1.        
JS _mm_rcp_ss: JS SSE1 is 5.49x SLOWER than native SSE1.        
JS _mm_rsqrt_ps: JS SSE1 is 7.12x SLOWER than native SSE1.        
JS _mm_rsqrt_ss: JS SSE1 is 7.23x SLOWER than native SSE1.        
JS _mm_sqrt_ps: JS SSE1 is 1.12x SLOWER than native SSE1.        
JS _mm_sqrt_ss: JS SSE1 is 1.18x SLOWER than native SSE1.        

Native _mm_cmpeq_ps: 1.4005ns -> 0.2784ns. Native SSE1 is 5.03x FASTER than native scalar.        
Native _mm_cmpeq_ss: 1.4005ns -> 0.2807ns. Native SSE1 is 4.99x FASTER than native scalar.        
Native _mm_cmpge_ps: 1.4748ns -> 0.3522ns. Native SSE1 is 4.19x FASTER than native scalar.        
Native _mm_cmpge_ss: 1.4748ns -> 0.4625ns. Native SSE1 is 3.19x FASTER than native scalar.        
Native _mm_cmpgt_ps: 1.4268ns -> 0.3520ns. Native SSE1 is 4.05x FASTER than native scalar.        
Native _mm_cmpgt_ss: 1.4268ns -> 0.4711ns. Native SSE1 is 3.03x FASTER than native scalar.        
Native _mm_cmple_ps: 1.1979ns -> 0.2784ns. Native SSE1 is 4.30x FASTER than native scalar.        
Native _mm_cmple_ss: 1.1979ns -> 0.2785ns. Native SSE1 is 4.30x FASTER than native scalar.        
Native _mm_cmplt_ps: 1.2544ns -> 0.3368ns. Native SSE1 is 3.72x FASTER than native scalar.        
Native _mm_cmplt_ss: 1.2544ns -> 0.3176ns. Native SSE1 is 3.95x FASTER than native scalar.        
Native _mm_cmpord_ps: 2.5458ns -> 0.2789ns. Native SSE1 is 9.13x FASTER than native scalar.        
Native _mm_cmpord_ss: 2.5458ns -> 0.2784ns. Native SSE1 is 9.14x FASTER than native scalar.        
Native _mm_cmpunord_ps: 2.8224ns -> 0.2992ns. Native SSE1 is 9.43x FASTER than native scalar.        
Native _mm_cmpunord_ss: 2.8224ns -> 0.3090ns. Native SSE1 is 9.13x FASTER than native scalar.        

JS _mm_cmpeq_ps: 1.2527ns -> 0.4402ns. JS SSE1 is 2.85x FASTER than JS scalar.        
JS _mm_cmpeq_ss: 1.2527ns -> 0.3743ns. JS SSE1 is 3.35x FASTER than JS scalar.        
JS _mm_cmpge_ps: 1.4251ns -> 0.3436ns. JS SSE1 is 4.15x FASTER than JS scalar.        
JS _mm_cmpge_ss: 1.4251ns -> 0.3703ns. JS SSE1 is 3.85x FASTER than JS scalar.        
JS _mm_cmpgt_ps: 1.4572ns -> 0.3377ns. JS SSE1 is 4.32x FASTER than JS scalar.        
JS _mm_cmpgt_ss: 1.4572ns -> 0.4251ns. JS SSE1 is 3.43x FASTER than JS scalar.        
JS _mm_cmple_ps: 1.2556ns -> 0.3170ns. JS SSE1 is 3.96x FASTER than JS scalar.        
JS _mm_cmple_ss: 1.2556ns -> 0.3669ns. JS SSE1 is 3.42x FASTER than JS scalar.        
JS _mm_cmplt_ps: 1.1414ns -> 0.2787ns. JS SSE1 is 4.10x FASTER than JS scalar.        
JS _mm_cmplt_ss: 1.1414ns -> 0.3544ns. JS SSE1 is 3.22x FASTER than JS scalar.        
JS _mm_cmpord_ps: 3.8326ns -> 0.4539ns. JS SSE1 is 8.44x FASTER than JS scalar.        
JS _mm_cmpord_ss: 3.8326ns -> 0.4901ns. JS SSE1 is 7.82x FASTER than JS scalar.        
JS _mm_cmpunord_ps: 3.6389ns -> 0.4614ns. JS SSE1 is 7.89x FASTER than JS scalar.        
JS _mm_cmpunord_ss: 3.6389ns -> 0.4838ns. JS SSE1 is 7.52x FASTER than JS scalar.        

JS _mm_cmpeq_ps: JS scalar is 1.12x FASTER than native scalar.        
JS _mm_cmpeq_ss: JS scalar is 1.12x FASTER than native scalar.        
JS _mm_cmpge_ps: JS scalar is 1.03x FASTER than native scalar.        
JS _mm_cmpge_ss: JS scalar is 1.03x FASTER than native scalar.        
JS _mm_cmpgt_ps: JS scalar is 1.02x SLOWER than native scalar.        
JS _mm_cmpgt_ss: JS scalar is 1.02x SLOWER than native scalar.        
JS _mm_cmple_ps: JS scalar is 1.05x SLOWER than native scalar.        
JS _mm_cmple_ss: JS scalar is 1.05x SLOWER than native scalar.        
JS _mm_cmplt_ps: JS scalar is 1.10x FASTER than native scalar.        
JS _mm_cmplt_ss: JS scalar is 1.10x FASTER than native scalar.        
JS _mm_cmpord_ps: JS scalar is 1.51x SLOWER than native scalar.        
JS _mm_cmpord_ss: JS scalar is 1.51x SLOWER than native scalar.        
JS _mm_cmpunord_ps: JS scalar is 1.29x SLOWER than native scalar.        
JS _mm_cmpunord_ss: JS scalar is 1.29x SLOWER than native scalar.        

JS _mm_cmpeq_ps: JS SSE1 is 1.58x SLOWER than native SSE1.        
JS _mm_cmpeq_ss: JS SSE1 is 1.33x SLOWER than native SSE1.        
JS _mm_cmpge_ps: JS SSE1 is 1.03x FASTER than native SSE1.        
JS _mm_cmpge_ss: JS SSE1 is 1.25x FASTER than native SSE1.        
JS _mm_cmpgt_ps: JS SSE1 is 1.04x FASTER than native SSE1.        
JS _mm_cmpgt_ss: JS SSE1 is 1.11x FASTER than native SSE1.        
JS _mm_cmple_ps: JS SSE1 is 1.14x SLOWER than native SSE1.        
JS _mm_cmple_ss: JS SSE1 is 1.32x SLOWER than native SSE1.        
JS _mm_cmplt_ps: JS SSE1 is 1.21x FASTER than native SSE1.        
JS _mm_cmplt_ss: JS SSE1 is 1.12x SLOWER than native SSE1.        
JS _mm_cmpord_ps: JS SSE1 is 1.63x SLOWER than native SSE1.        
JS _mm_cmpord_ss: JS SSE1 is 1.76x SLOWER than native SSE1.        
JS _mm_cmpunord_ps: JS SSE1 is 1.54x SLOWER than native SSE1.        
JS _mm_cmpunord_ss: JS SSE1 is 1.57x SLOWER than native SSE1.
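
For post-processing, the timing lines in this output follow a regular shape: platform, intrinsic name, scalar time, then SSE1 time. The sketch below extracts those fields with a regular expression. The pattern and the name `parse_timing` are assumptions based only on the lines shown here, and the function skips comparison-only lines.

```python
import re

# Matches lines such as:
#   Native _mm_load_ps: 2.5459ns -> 0.5112ns. Native SSE1 is 4.98x FASTER ...
TIMING = re.compile(r"^(Native|JS) (_mm_\w+): ([\d.]+)ns -> ([\d.]+)ns\.")


def parse_timing(line: str):
    """Return (platform, intrinsic, scalar_ns, sse1_ns), or None for other lines."""
    match = TIMING.match(line.strip())
    if match is None:
        return None
    platform, intrinsic, scalar_ns, sse1_ns = match.groups()
    return platform, intrinsic, float(scalar_ns), float(sse1_ns)


sample = "JS _mm_cmpunord_ss: 3.6389ns -> 0.4838ns. JS SSE1 is 7.52x FASTER than JS scalar."
print(parse_timing(sample))
```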