Scope
Customers who run any version of FAS in the Cloud and experience connection issues.
Symptoms/Context
When experiencing connection issues, gather as much information as possible while the issue is still occurring. The nature and root causes of what presents itself as a connection issue can vary considerably, and pinpointing the cause once the issue no longer occurs might not be possible.
Resolution
Identify the step of the query flow during which the connection issue occurs
Given the query flow described in the article "Query API. Full employment of fail-safe features", it is crucial to narrow down the step during which the issue occurs. The root cause can be on the Fredhopper side, on the customer's side, or on the route in between. Information for some of the steps is available on the Fredhopper side, while information for other steps can be gathered only on the customer side. The more information is supplied to Support, the quicker the issue can be identified.
Let's look at the steps in the query flow, the possible issues during these steps and who can gather the necessary information.
| # | Step | Questions to identify possible issues | Who can answer the questions? |
|---|---|---|---|
| 1 | The front-end composes the query and prepares it to be sent to the Query API end-point. | Does the front-end compose the correct query? Does the front-end send the query? Where does the front-end send the query to? | The front-end is out of scope for Support. This information needs to be gathered on the customer side. |
| 2 | The Query API end-point DNS name gets resolved. The DNS authority answers with the external IP of one of the load balancers. | Which IP addresses are resolved? Does the issue occur for all of them? | Whether the issue occurs only for specific load balancers can be determined only on the customer side. |
| 3 | The front-end sends the query to the IP address, using the Host header to indicate which service instance to use and supplying the account credentials. | How long does it take to connect? Do you observe packet loss? | Connection times and packet loss should be checked on the customer side. |
| 4 | The load balancer receives the query and authenticates the request (assumed here to succeed). It uses the Host header to decide which service instance the request should be forwarded to. Once decided, the query is passed to the active query server with the fewest open connections. | Did the query reach a query server? | This is checked by Support on the Fredhopper side. |
| 5 | The query server processes the query, logs it and sends the response back to the load balancer. | Was the query answered successfully? How long did it take for the answer to be generated? | This is checked by Support on the Fredhopper side. |
| 6 | The load balancer logs the query and sends the response back to the front-end. | How long did it take for the answer to reach the front-end? | This can be checked on both the customer and the Support side. |
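To answer the questions for step 2 from the customer side, it helps to list every IP address the end-point resolves to before testing each one. The snippet below is a minimal sketch, not an official Fredhopper tool: `only_ipv4` is a hypothetical helper name, and it assumes `dig` and a `grep` that supports `-E` are available. It filters resolver output down to IPv4 addresses, because `dig +short` can also print CNAME targets.

```shell
# Keep only IPv4 addresses from resolver output read on stdin.
# `dig +short` can also print CNAME targets, which cannot be used as IPs.
# only_ipv4 is a hypothetical helper name used for illustration.
only_ipv4() {
  grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Usage (replace the placeholders first):
# dig +short query.published.<instance>.<service>.<region>.fredhopperservices.com | only_ipv4
```

Running the resolution several times and comparing the results shows whether the set of load balancer IPs changes between lookups.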
Tools
This section gathers a list of tools for both Linux and Windows environments to collect information in order to diagnose connection issues.
Looking at the routes both from and to your front-end provides more insight. Therefore, please share your front-end's public IP(s) along with the output generated by the tools below, so that we can perform the same analysis towards the front-end.
Please save the output of the tools below in plain text files before sharing it. This saves time spent on formatting.
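One way to produce such plain text files is a small wrapper that runs a tool and stores everything it prints. This is only a sketch under assumptions: `save_output` and the file name in the usage line are hypothetical, and any POSIX-like shell should do.

```shell
# Run a command and store its output (stdout and stderr) in a plain text file,
# preceded by a header line with the UTC time and the exact command line.
# save_output is a hypothetical helper name used for illustration.
save_output() {
  outfile=$1
  shift
  {
    echo "# $(date -u '+%Y-%m-%d %H:%M:%S') UTC: $*"
    "$@"
  } > "$outfile" 2>&1
}

# Usage on Linux (replace the placeholder first):
# save_output ping_output.txt ping -c 20 <IP_ADDRESS>
```

The header line records when and how each file was produced, which makes it easier to correlate the output with the time of the issue.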
External monitor
Consider setting up an external monitor such as StatusCake or Pingdom so that connectivity and ISP-related issues can be traced easily from different locations in the world.
ping
A quick way to determine whether packet loss takes place is to use ping. For both Linux and Windows:
ping <IP_ADDRESS>
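When ping output has been saved, the packet-loss figure can be pulled out of the summary line. The helper below is a sketch rather than a definitive parser: `packet_loss` is a hypothetical name, it assumes a `grep` that supports `-o` and `-E`, and it matches the summary wording commonly printed by Linux ("% packet loss") and Windows ("% loss").

```shell
# Print the packet-loss percentage found in ping output read on stdin.
# Matches summaries such as "0% packet loss" and "(0% loss)".
# packet_loss is a hypothetical helper name used for illustration.
packet_loss() {
  grep -oE '[0-9.]+% (packet )?loss' | grep -oE '^[0-9.]+'
}

# Usage on Linux (replace the placeholder first):
# ping -c 20 <IP_ADDRESS> | tee ping_output.txt | packet_loss
```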
traceroute
Please use traceroute implementations that send TCP packets. With ICMP-based implementations, timeouts might occur because ICMP traffic is blocked or given low priority.
On Linux environments we recommend tcptraceroute:
tcptraceroute -n <IP_ADDRESS>
On Windows environments we recommend the tracetcp utility:
tracetcp <IP_ADDRESS>
mtr
On Linux environments we recommend mtr:
mtr --report-cycles 100 --report --no-dns --tcp --port 80 <IP_ADDRESS>
On Windows environments we recommend the WinMTR application.
Linux script to track connection times
The script below uses cURL to query the Fredhopper service every 2 seconds and logs the following information:
- time stamp
- ip address
- time namelookup
- time connect
- time appconnect
- time pretransfer
- time redirect
- time starttransfer
- time total
- HTTP code
- size download
It is useful in troubleshooting scenarios in which it takes a long time to connect. When the time needed to connect is higher than 2 seconds, a traceroute is performed. In such a scenario, keep the script running until the issue is resolved and share the output regularly with Support.
Please replace the <instance>, <service>, <region>, <username> and <password> placeholders before running the script.
The script requires the following to be available on your system: bash, curl, dig, bc and a traceroute implementation that supports TCP probes through the -T option (TCP probes usually require root privileges). Also be sure to capture the script output and errors in a file that can be shared with Fredhopper Support.
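Before starting a long-running capture, it can be worth checking that the required tools are installed. The sketch below assumes a POSIX-like shell; `missing_tools`, the script file name and the log file name are hypothetical. Pass the tools your copy of the script actually calls (as printed below, it calls dig and traceroute in addition to curl and bc).

```shell
# Print the name of every listed tool that is not found on the PATH.
# missing_tools is a hypothetical helper name used for illustration.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" > /dev/null 2>&1 || echo "$tool"
  done
}

# Usage: check the tools, then run the script with output and errors
# captured in one file (both file names are hypothetical):
# missing_tools bash curl dig bc traceroute
# ./track_connection_times.sh > connection_times.log 2>&1
```

If the function prints nothing, all listed tools were found.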
#!/bin/bash
# Replace <instance>, <service>, <region>, <username> and <password> before running.
host='query.published.<instance>.<service>.<region>.fredhopperservices.com'
while true
do
    # Resolve the end-point and probe every load balancer IP that is returned.
    dig "${host}" +short | while read -r ip_address
    do
        timeStamp=$( date '+%Y-%m-%d %H:%M:%S' )
        # Query the IP directly; the Host header selects the service instance.
        nextLine=$( curl -w '%{time_namelookup} %{time_connect} %{time_appconnect} %{time_pretransfer} %{time_redirect} %{time_starttransfer} %{time_total} %{http_code} %{size_download}\n' -o /dev/null -S -s -k -u '<username>:<password>' -H "Host: ${host}" "https://${ip_address}/fredhopper/query" )
        # The first two fields are time_namelookup and time_connect.
        read -r f_t_resolve f_t_connect notneeded <<< "${nextLine}"
        echo "${timeStamp} ${ip_address} ${nextLine}"
        # Run a TCP traceroute when connecting took longer than 2 seconds.
        if [ 1 -eq "$( echo "${f_t_connect:-0} > 2" | bc -l )" ] ; then
            traceroute -T -n -p 443 "${ip_address}"
        fi
        sleep 2
    done
done
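Once the script has been running for a while, the slow measurements can be picked out of the captured log. The helper below is a sketch that assumes the line layout written by the script above (date, time, IP address, then the nine cURL fields) and an available `awk`; `slow_connects` and the log file name are hypothetical.

```shell
# Print the measurement lines whose connect time (field 5) exceeds a limit.
# Expected layout of a measurement line, as written by the tracking script:
#   date time ip namelookup connect appconnect pretransfer redirect starttransfer total http_code size
# Traceroute output mixed into the log is skipped by the field-count and date checks.
# $1 = log file, $2 = limit in seconds (defaults to 2).
slow_connects() {
  awk -v limit="${2:-2}" '
    NF == 12 && $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ && $5 + 0 > limit + 0 {
      print $1, $2, $3, "connect=" $5, "total=" $10, "http=" $11
    }' "$1"
}

# Usage (log file name is hypothetical):
# slow_connects connection_times.log 2
```

Grouping the result by IP address shows whether slow connects are tied to a specific load balancer.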
Additional Questions to consider
- Do the slow responses occur for a specific type of request (e.g. only search or only navigation)?
- Does your front-end query any other external services? If so, do you experience any issues connecting to other external services?
- How much bandwidth is available?
- Do you have any bandwidth throttling or quality of service (QoS) functionalities enabled on your front-end's servers, firewalls or other network equipment?
- Do you have multiple internet lines available, and if so, does the problem occur on each of the lines?
Addendum
Please consider also the following articles from external resources, which provide insight on troubleshooting such issues: