Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

Quintin Par
My proxy cache path is set to a very high size

proxy_cache_path  /var/lib/nginx/cache  levels=1:2   keys_zone=staticfilecache:180m  max_size=700m;
and the size used is only

sudo du -sh *
14M cache
4.0K    proxy
Proxy cache valid is set to

proxy_cache_valid 200 120d;
I track HIT and MISS via

add_header X-Cache-Status $upstream_cache_status;
Despite these settings I am seeing a lot of MISSes, even for pages against which I deliberately ran a cache warmer an hour ago.

How do I debug why these MISSes are happening? How do I find out if the miss was due to eviction, expiration, some rogue header etc? Does Nginx provide commands for this?

- Quintin
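
One setting worth checking for the question above is the inactive parameter of proxy_cache_path. Per the nginx documentation, cached entries that are not accessed within the inactive window are removed regardless of their freshness, and the default is 10 minutes, so a 120d proxy_cache_valid on its own does not keep a warmed page on disk. The Python sketch below is only an illustration: it parses a proxy_cache_path line and reports the effective inactive window. The helper names are hypothetical, and only simple single-unit times such as 10m or 120d are handled.

```python
import re

# nginx time-unit suffixes mapped to seconds (single-unit values only).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
DEFAULT_INACTIVE = "10m"  # documented nginx default when inactive= is absent


def to_seconds(value):
    """Convert a simple nginx time such as '10m' or '120d' to seconds."""
    match = re.fullmatch(r"(\d+)([smhdw]?)", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    number, unit = match.groups()
    return int(number) * UNIT_SECONDS.get(unit, 1)  # bare numbers are seconds


def effective_inactive(directive):
    """Return (value, seconds) for the inactive= parameter of a proxy_cache_path line."""
    match = re.search(r"\binactive=([^\s;]+)", directive)
    value = match.group(1) if match else DEFAULT_INACTIVE
    return value, to_seconds(value)


# The directive exactly as posted in this thread (no inactive= parameter).
line = ("proxy_cache_path  /var/lib/nginx/cache  levels=1:2   "
        "keys_zone=staticfilecache:180m  max_size=700m;")
value, seconds = effective_inactive(line)
print(f"inactive={value} ({seconds} s)")  # prints: inactive=10m (600 s)
```

If the reported window is much shorter than the interval between the cache warmer run and real traffic, that alone could explain MISSes on warmed pages; raising inactive on proxy_cache_path would be the thing to try.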

_______________________________________________
nginx mailing list
[hidden email]
http://mailman.nginx.org/mailman/listinfo/nginx

Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

Lucas Rolff-2

It can be as simple as running curl against your “origin” URL (the one you proxy_pass to) for the files that get a lot of MISSes. If the responses carry odd headers such as cookies, you’ll most likely see poor caching unless your nginx is configured to ignore those headers.
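
The check described above can be sketched in Python as well. The header names are the caching-related ones that the nginx documentation lists for proxy_ignore_headers; the function names, the sample headers, and the /some-page path are illustrative assumptions, not something taken from this thread.

```python
from urllib.request import urlopen

# Upstream response headers that nginx consults when deciding whether and
# for how long to cache, unless they are listed in proxy_ignore_headers.
CACHE_AFFECTING = ("x-accel-expires", "expires", "cache-control", "set-cookie", "vary")


def cache_affecting_headers(headers):
    """Return the (name, value) pairs that can influence nginx caching."""
    return [(name, value) for name, value in headers
            if name.lower() in CACHE_AFFECTING]


def fetch_origin_headers(url):
    """GET the origin URL directly and return its response headers as pairs."""
    with urlopen(url) as response:
        return list(response.headers.items())


# Example with headers typed in by hand (made-up values). Against a live origin:
# cache_affecting_headers(fetch_origin_headers("http://127.0.0.1:8000/some-page"))
sample = [("Content-Type", "text/html"),
          ("Set-Cookie", "session=abc"),
          ("Cache-Control", "no-cache")]
for name, value in cache_affecting_headers(sample):
    print(f"{name}: {value}")
```

Any header this prints for a URL that keeps missing is a candidate for proxy_ignore_headers, or for a fix on the origin side.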

 

From: nginx <[hidden email]> on behalf of Quintin Par <[hidden email]>
Reply-To: "[hidden email]" <[hidden email]>
Date: Saturday, 12 May 2018 at 18.26
To: "[hidden email]" <[hidden email]>
Subject: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

 


Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

Quintin Par

That’s the tricky part. These MISSes are intermittent. Whenever I run curl I get HITs, but I end up seeing a lot of MISSes in the logs.

How do I log these MISSes with the reason? I want to know which headers ended up bypassing the cache.

Here’s my caching config:

    proxy_pass http://127.0.0.1:8000;
    proxy_set_header X-Real-IP  $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto https;
    proxy_set_header X-Forwarded-Port 443;

    # If logged in, don't cache.
    if ($http_cookie ~* "comment_author_|wordpress_(?!test_cookie)|wp-postpass_" ) {
      set $do_not_cache 1;
    }
    proxy_cache_key "$scheme://$host$request_uri$do_not_cache";
    proxy_cache staticfilecache;
    add_header Cache-Control public;
    proxy_cache_valid       200 120d;
    proxy_hide_header "Set-Cookie";
    proxy_ignore_headers  "Set-Cookie";
    proxy_ignore_headers  "Cache-Control";
    proxy_hide_header "Cache-Control";
    proxy_pass_header X-Accel-Expires;

    proxy_set_header Accept-Encoding "";
    proxy_ignore_headers Expires;
    add_header X-Cache-Status $upstream_cache_status;
    proxy_cache_use_stale   timeout;
    proxy_cache_bypass $arg_nocache $do_not_cache;

- Quintin
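
One way to get closer to the reason is to log more per request and then tally it. The Python sketch below assumes a hypothetical access-log layout in which each line carries cache=<$upstream_cache_status> and cookie="<$http_cookie>". nginx can log both variables through log_format, but this exact layout and the sample lines are made up for illustration. The script counts cache statuses per request URI so that URIs with many MISS, BYPASS, or EXPIRED entries stand out.

```python
import re
from collections import Counter, defaultdict

# Assumed log layout (hypothetical), e.g.:
# 1.2.3.4 "GET /about/ HTTP/1.1" 200 cache=MISS cookie="-"
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<uri>\S+) [^"]*" (?P<status>\d{3}) '
    r'cache=(?P<cache>\S+) cookie="(?P<cookie>[^"]*)"'
)


def tally_cache_status(lines):
    """Count $upstream_cache_status values per request URI."""
    per_uri = defaultdict(Counter)
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # skip lines that do not follow the assumed layout
            per_uri[match["uri"]][match["cache"]] += 1
    return per_uri


# Made-up sample lines in the assumed layout.
sample_log = [
    '1.2.3.4 "GET /about/ HTTP/1.1" 200 cache=HIT cookie="-"',
    '1.2.3.5 "GET /about/?utm=1 HTTP/1.1" 200 cache=MISS cookie="-"',
    '1.2.3.6 "GET /about/ HTTP/1.1" 200 cache=BYPASS cookie="wordpress_logged_in=x"',
]
for uri, counts in tally_cache_status(sample_log).items():
    print(uri, dict(counts))
```

Grouping by the full request URI matters here because the cache key in the config above includes $request_uri, so /about/ and /about/?utm=1 are separate cache entries.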


On Sat, May 12, 2018 at 10:29 AM Lucas Rolff <[hidden email]> wrote:


Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

Friscia, Michael

I'm not sure if this will help, but I ignore/hide a lot of headers. This is in my config:


proxy_ignore_headers X-Accel-Expires Expires Cache-Control Set-Cookie;
proxy_hide_header X-Accel-Expires;
proxy_hide_header Pragma;
proxy_hide_header Server;
proxy_hide_header Request-Context;
proxy_hide_header X-Powered-By;
proxy_hide_header X-AspNet-Version;
proxy_hide_header X-AspNetMvc-Version;


I have not experienced the problem you mention; I just thought I would offer my config.


___________________________________________

Michael Friscia

Office of Communications

Yale School of Medicine

(203) 737-7932 – office

(203) 931-5381 – mobile

http://web.yale.edu




From: nginx <[hidden email]> on behalf of Quintin Par <[hidden email]>
Sent: Saturday, May 12, 2018 1:32 PM
To: [hidden email]
Subject: Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid
 


Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

Cherian Thomas

Thanks for this Michael.

 

This is so surprising. If someone decides to DoS the site and crawls it with a rogue header, this will essentially bypass the cache and put a strain on the website. In fact, I was hit by a DoS attack; that’s when I started looking at the logs and noticed the large number of MISSes.

 

Can someone please help?



- Cherian

On Sat, May 12, 2018 at 12:01 PM, Friscia, Michael <[hidden email]> wrote:


Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

allenhe
You know you can DoS sites with cache MISSes by switching up URL params and
arguments.

Examples :

HIT :
index.php?var1=one&var2=two
MISS :
index.php?var2=two&var1=one

MISS :
index.php?random=1
index.php?random=2
index.php?random=3
etc etc

Inserting random arguments into URLs will cause cache misses, and changing
the order of existing valid URL arguments will also cause misses.
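
The effect is easy to reproduce outside nginx. In the config posted earlier in this thread the cache key contains $request_uri, which is the raw URI including the query string, so two URLs that differ only in argument order map to different keys. The Python sketch below shows one possible normalization idea: sort the arguments and keep only an allow-list before building a key. The function name and the allow-list are hypothetical, and applying this inside nginx itself would need extra configuration that is not shown here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url, allowed_args=None):
    """Sort query arguments and optionally drop those not in allowed_args."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if allowed_args is not None:
        pairs = [(k, v) for k, v in pairs if k in allowed_args]
    query = urlencode(sorted(pairs))
    # The fragment is dropped; it is never sent to the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


a = normalize_url("index.php?var1=one&var2=two")
b = normalize_url("index.php?var2=two&var1=one")
print(a == b)  # True: both orderings normalize to the same string
print(normalize_url("index.php?random=3&var1=one", allowed_args={"var1", "var2"}))
```

With an allow-list, the random=1, random=2, ... variants above would all collapse to the same key as the plain URL.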


Posted at Nginx Forum: https://forum.nginx.org/read.php?2,279764,279771#msg-279771


Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

Quintin Par

Thanks, all, for the responses. Michael, I am going to add those header ignores.

I’m still puzzled by the large number of MISSes and have no clue why they are happening. Any leads would be appreciated.


- Quintin

On Sun, May 13, 2018 at 6:12 PM, c0nw0nk <[hidden email]> wrote:

Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

Friscia, Michael

I wish I had a lead for you. I’ve never seen that behavior.

 


 


Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

Peter Booth
In reply to this post by Quintin Par

Quintin,

I don't know anything about your context, but your setup looks overly simplistic. Here are some things that I learned
painfully over a few years of supporting a high-traffic retail website:

1. Is this a website that's on the internet, and thus exposed to random queries from bots and scrapers that you can’t control?

2. For your cache misses, how long does your back end take to build the pages in the best, typical, and worst cases?

3. You need to log everything that could feasibly affect the status of the site. For example, here’s a log configuration from one gnarly site that I worked on:

    log_format main '$http_x_forwarded_for $http_true_client_ip $remote_addr - $remote_user [$time_local] $host "$request" '
                      '$status $body_bytes_sent $upstream_cache_status $cookie_jsessionid $http_akamai_country $cookie_e4x_country $cookie_e4x_currency "$http_referer" '
                      '"$http_user_agent" "$request_time"';

4. The first problem is your cache key: it includes $request_uri, which is the original URI
including all arguments. So you are already exposed to DoS requests, which could even be unintentional,
as anyone can bust your cache by adding an extra parameter.

 proxy_cache_key "$scheme://$host$request_uri$do_not_cache";

5. Not caching requests from logged-in users is a very blunt tool. Is this a site where only administrative users are logged in?

Imagine a retail site that sells clothing. It’s possible that a dynamic page that lists all the red dresses is something
a logged-in user sees. Perhaps the page can be cached? But if there is one version of the page that shows 30 entries and another
that shows 60, then they need to be disambiguated by the cache key. Perhaps users can choose to see prices in euros instead of USD?
Then this also belongs in the key. If I am an American vacationing in Paris, then perhaps the default behavior should be to show me
euro prices, based on the value of a cookie that the CDN sets. In that situation the customer may want to override this default behavior
and insist on seeing USD prices. You can see how complex this can get.

7. The default behavior is to not cache responses that contain a Set-Cookie header. Imagine cache pollution: sending someone another person’s personal data stored in a cookie could be much worse than a cache miss. But there are also settings where your backend is some legacy software that you don't control,
and the correct behavior isn’t to skip caching but instead to remove the Set-Cookie header from the response and cache the response without it.

8. How you prime the cache, monitor the cache, and clear the cache is crucial. Perhaps you have a script that uses curl or wget to retrieve a series of pages from your site. If the script is written naively, then each request might cause a new servlet session to be created on the backend, producing a memory issue.

9.  script is very useful to track the health of your cache:


10. The if directive in nginx has some issues (see https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/).
When I need to use complex configuration logic I use OpenResty. OpenResty is a bundle that
combines the standard nginx with some additional Lua modules. It’s still standard nginx,
not forked or customized in any way.

11. A very cut-down version of a cache config for one page follows:

# Product arrays get cached
        location ~ /shop/ {
            rewrite "/(.*)/2];ord.*$" $1 ;
            proxy_no_cache $arg_mid $arg_siteID;
            proxy_cache_bypass $arg_mid $arg_siteID;
            proxy_cache_use_stale updating;
            default_type text/html;
            proxy_cache_valid 200 302 301 15m;
            proxy_ignore_headers Set-Cookie Cache-Control; 
            proxy_pass_header off;
            proxy_hide_header Set-Cookie;
            expires 900s;
            add_header  Last-Modified "";
            add_header  ETag "";            
            # Build cache key            
            set $e4x_currency $cookie_e4x_currency;
            set_if_empty $e4x_currency 'USD';
            set $num_items $cookie_EndecaNumberOfItems;
            set_if_empty $num_items 'LOW';           
            proxy_cache_key "$uri|$e4x_currency|$num_items";
            proxy_cache product_arrays;            
            # Add Canonical URL string
            set $folder_id $arg_FOLDER%3C%3Efolder_id;
            set $canonical_url "http://$http_host$uri";
            add_header Link "<$canonical_url>; rel=\"canonical\"";
            proxy_pass http://apache$request_uri;
        }


This snippet shows a key made of three parts. The real version has seven parts.
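
For a much simpler site, points 4 and 10 might be sketched as follows. This is an untested illustration under assumptions, not a drop-in config: it reuses the staticfilecache zone, the backend address, and the cookie regex from the config posted earlier in this thread, it assumes the existing proxy_cache_path stays in place, and the "page" argument is a hypothetical stand-in for whatever parameter genuinely changes the response.

```nginx
# http {} context: derive the flag with map instead of "if" (point 10).
# The regex is the one from the config posted earlier in this thread.
map $http_cookie $do_not_cache {
    default "";
    "~*comment_author_|wordpress_(?!test_cookie)|wp-postpass_" 1;
}

server {
    # listen, server_name, etc. omitted

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_cache staticfilecache;
        proxy_cache_valid 200 120d;

        # Point 4: $uri carries no query string, so ?random=1 and
        # ?random=2 share one cache entry. "page" is a hypothetical
        # argument that is allowed to vary the key.
        proxy_cache_key "$scheme://$host$uri|$arg_page";

        # Logged-in visitors skip the cache and are never stored in it.
        proxy_cache_bypass $do_not_cache;
        proxy_no_cache     $do_not_cache;
    }
}
```

Keying on $uri is only safe if no other query argument selects different content. A site that serves posts through arguments such as ?p=123 would need each of those arguments added to the key, otherwise different pages would collapse into one cache entry.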

Good luck!

Peter



Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

Quintin Par

Thank you so much for this Peter. Very helpful.

 

For what it’s worth, I run a static WordPress website, so the configuration should not be very complicated.

 

The link that you provided also led me to https://github.com/perusio/wordpress-nginx

 

To answer your queries:

 

>1. Is this a website that's on the internet, and thus exposed to random queries from bots and scrapers that you can’t control?

Yes, and it sees a lot of the scammy attacks typical of WordPress websites. I’ve enabled connection limiting and request limiting for WordPress, along with fail2ban on the request-limiting rule.

 

> 2. For your cache misses, how long best case, typical and worse case does your back-end take to build the pages?

I run a warmer script and I expect all the pages to stay there 120 days. This is run every week and takes 1 hour.

 

4. Instead of $request_uri what’s the right variable that excludes all parameters? Is it $uri?

 

> 9.  script is very useful to track the health of your cache:

Thank you for this.

 

Based on your response, my suspicion is that URL params might be the culprit here. But I wish there was a way to diagnose the root cause directly. Do you know of any param/variable I can write to the access log for this?
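
To make the question concrete, this is the kind of logging I have in mind. It is only a sketch under assumptions: the format name cache_debug and the log path are made up, and it simply records $upstream_cache_status next to the path, the query string, and the cookie header so that MISS lines can be compared against the URLs the warmer requested.

```nginx
# http {} context. "cache_debug" is a hypothetical format name.
log_format cache_debug '$remote_addr [$time_local] "$request" $status '
                       'cache=$upstream_cache_status '
                       'uri="$uri" args="$args" '
                       'cookie="$http_cookie" ua="$http_user_agent"';

# In the server {} or location {} that does the caching.
# The path is an assumption; any writable log location would do.
access_log /var/log/nginx/cache_debug.log cache_debug;
```

$upstream_cache_status already reports EXPIRED and BYPASS separately from MISS, so entries that aged out or skipped the cache would show up under their own labels. As far as I can tell it cannot say on its own whether a MISS came from eviction or from a key that was never cached, but unexpected values in args or cookie on the MISS lines should narrow it down.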

 

- Quintin



