ice FreeBSD* Base Driver for the Intel(R) Ethernet 800 Series of Adapters
==========================================================================
May 18, 2021

Contents
========

- Overview
- Identifying Your Adapter
- The VF Driver
- Building and Installation
- Configuration and Tuning
- Known Issues/Troubleshooting


Important Notes
===============

Firmware Recovery Mode
----------------------
A device will enter Firmware Recovery mode if it detects a problem that
requires the firmware to be reprogrammed. When a device is in Firmware Recovery
mode it will not pass traffic or allow any configuration; you can only attempt
to recover the device's firmware. Refer to the Intel(R) Ethernet Adapters and
Devices User Guide for details on Firmware Recovery Mode and how to recover
from it.


Overview
========
This file describes the FreeBSD* driver for Intel(R) Ethernet. This driver has
been developed for use with all community-supported versions of FreeBSD.

For questions related to hardware requirements, refer to the documentation
supplied with your Intel Ethernet Adapter. All hardware requirements listed
apply to use with FreeBSD.

The associated Virtual Function (VF) driver for this driver is iavf.


Identifying Your Adapter
========================
The driver is compatible with devices based on the following:
  * Intel(R) Ethernet Controller E810-C
  * Intel(R) Ethernet Controller E810-XXV

For information on how to identify your adapter, and for the latest Intel
network drivers, refer to the Intel Support website:
http://www.intel.com/support


The VF Driver
=============
The VF driver is normally used in a virtualized environment where a host driver
manages SR-IOV and provides a VF device to the guest.

In the FreeBSD guest, the iavf driver is loaded and functions using the VF
device assigned to it.

The VF driver provides most of the same functionality as the core driver, but
is subordinate to the host. Access to many controls is accomplished by sending
a request to the host via what is called the "Admin queue." These requests
occur mainly during startup and initialization; once in operation, the device
is self-contained and should achieve near-native performance.

Some notable limitations of the VF environment:
  * The VF cannot enable promiscuous mode itself; the PF must allow it, using
    iovctl.
  * Media info is not available from the PF, so the media type will always
    appear as auto.


Adaptive Virtual Function
-------------------------
Adaptive Virtual Function (AVF) allows the virtual function driver, or VF, to
adapt to changing feature sets of the physical function driver (PF) with which
it is associated. This allows system administrators to update a PF without
having to update all the VFs associated with it. All AVFs have a single common
device ID and branding string.

AVFs have a minimum set of features known as "base mode," but may provide
additional features depending on what features are available in the PF with
which the AVF is associated. The following are base mode features:

- 4 Queue Pairs (QP) and associated Configuration Status Registers (CSRs)
  for Tx/Rx
- iavf descriptors and ring format
- Descriptor write-back completion
- 1 control queue, with iavf descriptors, CSRs and ring format
- 5 MSI-X interrupt vectors and corresponding iavf CSRs
- 1 Interrupt Throttle Rate (ITR) index
- 1 Virtual Station Interface (VSI) per VF
- 1 Traffic Class (TC), TC0
- Receive Side Scaling (RSS) with 64 entry indirection table and key,
  configured through the PF
- 1 unicast MAC address reserved per VF
- 16 MAC address filters for each VF
- Stateless offloads - non-tunneled checksums
- AVF device ID
- HW mailbox is used for VF to PF communications (including on Windows)


Building and Installation
=========================
NOTE: This driver package is to be used only as a standalone archive and the
user should not attempt to incorporate it into the kernel source tree.

In the instructions below, x.x.x is the driver version as indicated in the name
of the driver tar file.

1. Move the base driver tar file to the directory of your choice. For
   example, use /home/username/ice or /usr/local/src/ice.

2. Untar/unzip the archive:

   # tar xzf ice-x.x.x.tar.gz

This will create the ice-x.x.x directory.

3. To install man page:

   # cd ice-x.x.x
   # gzip -c ice.4 > /usr/share/man/man4/ice.4.gz

4. To load the driver onto a running system:

   # cd ice-x.x.x/src
   # make
   # kldload ./if_ice.ko

   NOTE: Running the make command will not install the Dynamic Device
   Personalization (DDP) package and could cause the driver to fail to load.
   See step #7 below for more information.

5. To assign an IP address to the interface, enter the following,
   where X is the interface number for the device:

   # ifconfig iceX <IP_address>

6. Verify that the interface works. Enter the following, where <IP_address>
   is the IP address for another machine on the same subnet as the interface
   that is being tested:

   # ping <IP_address>

7. If you want the driver to load automatically when the system is booted:

   # cd ice-x.x.x/src
   # make
   # make install

NOTE: It's important to run 'make install' so that the driver loads the DDP
package automatically.

Edit /boot/loader.conf, and add the following line:
   if_ice_load="YES"

Edit /etc/rc.conf, and create the appropriate ifconfig_iceX entry:

   ifconfig_iceX="<ifconfig_settings>"

Example usage:
   ifconfig_ice0="inet 192.168.10.1 netmask 255.255.255.0"

NOTE: For assistance, see the ifconfig man page.


Configuration and Tuning
========================

Important System Configuration Changes
--------------------------------------
- Change the file /etc/sysctl.conf, and add the line (the default value is
  1000):

  hw.intr_storm_threshold=0

- Best throughput results are seen with a large MTU; use 9706 if possible.
  The default number of descriptors per ring is 1024. Increasing this may
  improve performance, depending on your use case.
- If you have a choice, run on a 64-bit OS rather than a 32-bit OS.


Configuring for iflib
---------------------
Iflib is a common framework for network interface drivers for FreeBSD that uses
a shared set of sysctl names. The iflib driver works best in FreeBSD 11.3 and
later.

See the iflib man page for more information.


Dynamic Device Personalization
------------------------------
Dynamic Device Personalization (DDP) allows you to change the packet processing
pipeline of a device by applying a profile package to the device at runtime.
Profiles can be used to, for example, add support for new protocols, change
existing protocols, or change default settings. DDP profiles can also be rolled
back without rebooting the system.

The ice driver automatically installs the default DDP package file during
driver installation. NOTE: It's important to do 'make install' during initial
ice driver installation so that the driver loads the DDP package automatically.

The DDP package loads during device initialization. The driver looks for the
ice_ddp module and checks that it contains a valid DDP package file.

If the driver is unable to load the DDP package, the device will enter Safe
Mode. Safe Mode disables advanced and performance features and supports only
basic traffic and minimal functionality, such as updating the NVM or
downloading a new driver or DDP package. Safe Mode only applies to the affected
physical function and does not impact any other PFs. See the "Intel(R) Ethernet
Adapters and Devices User Guide" for more details on DDP and Safe Mode.

NOTES:
- If you encounter issues with the DDP package file, you may need to download
an updated driver or ice_ddp module. See the log messages for more information.

- You cannot update the DDP package if any PF drivers are already loaded. To
overwrite a package, unload all PFs and then reload the driver with the new
package.

- You can only use one DDP package per driver, even if you have more than one
device installed that uses the driver.

- Only the first loaded PF per device can download a package for that device.


FW-LLDP (Firmware Link Layer Discovery Protocol)
------------------------------------------------
Use sysctl to change FW-LLDP settings. The FW-LLDP setting is per port and
persists across boots.

To enable LLDP:

# sysctl dev.ice.<interface #>.fw_lldp_agent=1

To disable LLDP:

# sysctl dev.ice.<interface #>.fw_lldp_agent=0

To check the current LLDP setting:

# sysctl dev.ice.<interface #>.fw_lldp_agent
     or
# sysctl -a|grep lldp

NOTE: You must enable the UEFI HII "LLDP Agent" attribute for this setting to
take effect. If "LLDP Agent" is set to disabled, you cannot enable it from the
OS.


Jumbo Frames
------------
Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU)
to a value larger than the default value of 1500.

Use the ifconfig command to increase the MTU size. For example, enter the
following where X is the interface number:

# ifconfig iceX mtu 9000

To confirm an interface's MTU value, use the ifconfig command.

To confirm the MTU used between two specific devices, use:

# route get <destination_IP_address>

NOTE: The maximum MTU setting for jumbo frames is 9706. This corresponds to the
maximum jumbo frame size of 9728 bytes.
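
The 22-byte difference between the two numbers can be checked directly. One
plausible accounting (an illustration, not taken from the driver
documentation) is the 14-byte Ethernet header, a 4-byte VLAN tag, and the
4-byte frame check sequence on top of the 9706-byte MTU:

```shell
# 9706 (MTU) + 14 (Ethernet header) + 4 (VLAN tag) + 4 (FCS) = 9728
echo $((9706 + 14 + 4 + 4))
```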

NOTE: This driver will attempt to use multiple page-sized buffers to receive
each jumbo packet. This should help avoid buffer starvation issues when
allocating receive packets.

NOTE: Packet loss may have a greater impact on throughput when you use jumbo
frames. If you observe a drop in performance after enabling jumbo frames,
enabling flow control may mitigate the issue.


VLANs
-----
To create a new VLAN interface:

# ifconfig <vlan_name> create

To associate the VLAN interface with a physical interface and assign a VLAN ID,
IP address, and netmask:

# ifconfig <vlan_name> <ip_address> netmask <subnet_mask> vlan <vlan_id>
vlandev <physical_interface>

Example:

# ifconfig vlan10 10.0.0.1 netmask 255.255.255.0 vlan 10 vlandev ice0

In this example, all packets will be marked on egress with 802.1Q VLAN tags,
specifying a VLAN ID of 10.

To remove a VLAN interface:

# ifconfig <vlan_name> destroy


Checksum Offload
----------------
Checksum offloading supports both TCP and UDP packets and is supported for both
transmit and receive.

Checksum offloading can be enabled or disabled using ifconfig.

To enable checksum offloading:

# ifconfig iceX rxcsum rxcsum6
# ifconfig iceX txcsum txcsum6

To disable checksum offloading:

# ifconfig iceX -rxcsum -rxcsum6
# ifconfig iceX -txcsum -txcsum6

To confirm the current setting:

# ifconfig iceX

Look for the presence or absence of the following line:
  options=3<RXCSUM,TXCSUM,RXCSUM6,TXCSUM6>

See the ifconfig man page for further information.


TSO
---
TSO (TCP Segmentation Offload) supports both IPv4 and IPv6. TSO can be disabled
and enabled using the ifconfig utility or sysctl.

NOTE: TSO requires Tx checksum; if Tx checksum is disabled, TSO will also be
disabled.

To enable/disable TSO in the stack:

# sysctl net.inet.tcp.tso=0 (or 1 to enable it)

Doing this disables/enables TSO in the stack and affects all installed adapters.

To disable BOTH TSO IPv4 and IPv6, where X is the number of the interface in
use:

# ifconfig iceX -tso

To enable BOTH TSO IPv4 and IPv6:

# ifconfig iceX tso

You can also enable/disable IPv4 TSO or IPv6 TSO individually. Simply replace
tso|-tso in the above command with tso4 or tso6. For example, to disable
TSO IPv4:

# ifconfig iceX -tso4

To disable TSO IPv6:

# ifconfig iceX -tso6


LRO
---
LRO (Large Receive Offload) may improve Rx performance. However, it is
incompatible with packet-forwarding workloads. Carefully evaluate your
environment and enable LRO only where it is appropriate.

To enable:

# ifconfig iceX lro

It can be disabled by using:

# ifconfig iceX -lro


Rx and Tx Descriptor Rings
--------------------------
These tunables allow you to set the Rx and Tx descriptor ring sizes
independently:
  hw.ice.rx_ring_size
  hw.ice.tx_ring_size

The valid range is 32-4096 in increments of 32. Use kenv to configure the
descriptor rings. Changes will take effect on the next driver reload.
For example:

# kenv hw.ice.rx_ring_size=1024
# kenv hw.ice.tx_ring_size=1280

You can verify the descriptor ring size by using the following sysctls:

# sysctl dev.ice.<interface_num>.rx_ring_size
# sysctl dev.ice.<interface_num>.tx_ring_size

If you are using iflib, use the following sysctls instead:

# sysctl dev.ice.<interface_num>.iflib.override_nrxds
# sysctl dev.ice.<interface_num>.iflib.override_ntxds


Flow Control
------------
Ethernet Flow Control (IEEE 802.3x) can be configured with sysctl to enable
receiving and transmitting pause frames for ice. When transmit is enabled,
pause frames are generated when the receive packet buffer crosses a predefined
threshold. When receive is enabled, the transmit unit will halt for the time
delay specified when a pause frame is received.

NOTE: You must have a flow control capable link partner.

Flow Control is disabled by default.

Use sysctl to change the flow control settings for a single interface without
reloading the driver.

The available values for flow control are:
  0 = Disable flow control
  1 = Enable Rx pause
  2 = Enable Tx pause
  3 = Enable Rx and Tx pause

Examples:
- To enable a flow control setting with sysctl:
    # sysctl dev.ice.<interface_num>.fc=3

- To disable flow control using sysctl:
    # sysctl dev.ice.<interface_num>.fc=0

NOTE:
- The ice driver requires flow control on both the port and link partner. If
flow control is disabled on one of the sides, the port may appear to hang on
heavy traffic.

NOTE: The VF driver does not have access to flow control. It must be managed
from the host side.


Forward Error Correction (FEC)
------------------------------
Allows you to set the Forward Error Correction (FEC) mode. FEC improves link
stability, but increases latency. Many high quality optics, direct attach
cables, and backplane channels provide a stable link without FEC.

NOTE: For devices to benefit from this feature, link partners must have FEC
enabled.

Use sysctl to configure FEC.

To show the FEC mode currently negotiated on the link:

# sysctl dev.ice.<interface_num>.negotiated_fec

To view or set the FEC mode requested on the link:

# sysctl dev.ice.<interface_num>.requested_fec

To list the valid FEC modes for the link:

# sysctl -d dev.ice.<interface_num>.requested_fec


Firmware Logs
-------------
The ice driver allows you to generate firmware logs for supported categories of
events, to help debug issues with Intel Customer Support. Firmware logs are
enabled by default.

Once the driver is loaded, it will create the fw_log sysctl node under the
debug section of the driver's sysctl list. The driver groups these events into
categories, called "modules." Supported modules include:
* general - General (Bit 0)
* ctrl - Control (Bit 1)
* link - Link Management (Bit 2)
* link_topo - Link Topology Detection (Bit 3)
* dnl - Link Control Technology (Bit 4)
* i2c - I2C (Bit 5)
* sdp - SDP (Bit 6)
* mdio - MDIO (Bit 7)
* adminq - Admin Queue (Bit 8)
* hdma - Host DMA (Bit 9)
* lldp - LLDP (Bit 10)
* dcbx - DCBx (Bit 11)
* dcb - DCB (Bit 12)
* xlr - XLR (function-level resets; Bit 13)
* nvm - NVM (Bit 14)
* auth - Authentication (Bit 15)
* vpd - Vital Product Data (Bit 16)
* iosf - Intel On-Chip System Fabric (Bit 17)
* parser - Parser (Bit 18)
* sw - Switch (Bit 19)
* scheduler - Scheduler (Bit 20)
* txq - TX Queue Management (Bit 21)
* acl - ACL (Access Control List; Bit 22)
* post - Post (Bit 23)
* watchdog - Watchdog (Bit 24)
* task_dispatch - Task Dispatcher (Bit 25)
* mng - Manageability (Bit 26)
* synce - SyncE (Bit 27)
* health - Health (Bit 28)
* tsdrv - Time Sync (Bit 29)
* pfreg - PF Registration (Bit 30)
* mdlver - Module Version (Bit 31)

You can change the verbosity level of the firmware logs. You can set only one
log level per module, and each level includes the verbosity levels lower than
it. For instance, setting the level to "normal" will also log warning and error
messages. Available verbosity levels are:
 0 = none
 1 = error
 2 = warning
 3 = normal
 4 = verbose

To set the desired verbosity level for a module, use the following sysctl
command and then register it:

# sysctl dev.ice.<interface_num>.debug.fw_log.severity.<module>=<level>

For example:
# sysctl dev.ice.0.debug.fw_log.severity.link=1
# sysctl dev.ice.0.debug.fw_log.severity.link_topo=2
# sysctl dev.ice.0.debug.fw_log.register=1

To log firmware messages before the driver initializes, set the tunables with
the kenv command. The on_load setting tells the driver to register the logging
configuration as soon as possible during driver load. For example:

# kenv dev.ice.0.debug.fw_log.severity.link=1
# kenv dev.ice.0.debug.fw_log.severity.link_topo=2
# kenv dev.ice.0.debug.fw_log.on_load=1

To view the firmware logs and redirect them to a file, use the following
command:

# dmesg > log_output

NOTE: Logging a large number of modules or setting too high a verbosity level
will add extraneous messages to dmesg and could hinder debug efforts.


Speed and Duplex Configuration
------------------------------
You cannot force a specific speed, duplex, or autonegotiation setting.
However, you can control which of its supported speeds the device advertises
during link negotiation:

# sysctl dev.ice.<interface_num>.advertise_speed=<mask>

Supported speeds will vary by device. Depending on the speeds your device
supports, available speed masks could include:
0x0 - Auto
0x2 - 100 Mbps
0x4 - 1 Gbps
0x8 - 2.5 Gbps
0x10 - 5 Gbps
0x20 - 10 Gbps
0x80 - 25 Gbps
0x100 - 40 Gbps
0x200 - 50 Gbps
0x400 - 100 Gbps
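
To advertise more than one speed, the mask bits can be combined with a bitwise
OR (whether a particular combination is accepted depends on the device; the
interface number ice0 below is illustrative). For example, 25 Gbps (0x80) plus
100 Gbps (0x400) gives 0x480, which the shell can compute for you:

```shell
# Combine the 25 Gbps and 100 Gbps mask bits; prints the value to pass
# to advertise_speed.
printf '0x%x\n' $((0x80 | 0x400))
# Then, as root on the target system (ice0 is illustrative):
#   sysctl dev.ice.0.advertise_speed=0x480
```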


Known Issues/Troubleshooting
============================

Driver Buffer Overflow Fix
--------------------------
The fix to resolve CVE-2016-8105, referenced in Intel SA-00069
<https://security-center.intel.com/advisory.aspx?intelid=INTEL-SA-00069&languageid=en-fr>,
is included in this and future versions of the driver.


Network Memory Buffer Allocation
--------------------------------
FreeBSD may have a low number of network memory buffers (mbufs) by default. If
your mbuf value is too low, it may cause the driver to fail to initialize
and/or cause the system to become unresponsive. You can check to see if the
system is mbuf-starved by running 'netstat -m'. Increase the number of mbufs by
editing the lines below in /etc/sysctl.conf:

  kern.ipc.nmbclusters
  kern.ipc.nmbjumbop
  kern.ipc.nmbjumbo9
  kern.ipc.nmbjumbo16
  kern.ipc.nmbufs

The amount of memory that you allocate is system specific, and may require some
trial and error. Also, increasing the following in /etc/sysctl.conf could help
increase network performance:

  kern.ipc.maxsockbuf
  net.inet.tcp.sendspace
  net.inet.tcp.recvspace
  net.inet.udp.maxdgram
  net.inet.udp.recvspace
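
As an illustrative starting point only, an /etc/sysctl.conf fragment might
look like the following; the values are placeholders, and the right numbers
depend on your system's memory and workload:

```
# Placeholder values; tune for your system and workload.
kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendspace=262144
net.inet.tcp.recvspace=262144
```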


UDP Stress Test Dropped Packet Issue
------------------------------------
Under small packet UDP stress with the ice driver, the system may drop UDP
packets due to socket buffers being full. Setting the driver's Intel Ethernet
Flow Control variables to the minimum may resolve the issue. You may also try
increasing the kernel's default socket buffer sizes by raising the
kern.ipc.maxsockbuf and net.inet.udp.recvspace sysctls in /etc/sysctl.conf.


Disable LRO when routing/bridging
---------------------------------
LRO must be turned off when forwarding traffic.


Lower than expected performance
-------------------------------
Some PCIe x8 slots are actually configured as x4 slots. These slots have
insufficient bandwidth for full line rate with dual port and quad port devices.
In addition, if you put a PCIe v4.0 or v3.0-capable adapter into a PCIe v2.x
slot, you cannot get full bandwidth. The driver detects this situation and
writes one of the following messages in the system log:

"PCI-Express bandwidth available for this card is not sufficient for optimal
performance. For optimal performance a x8 PCI-Express slot is required."
  or
"PCI-Express bandwidth available for this device may be insufficient for
optimal performance. Please move the device to a different PCI-e link with more
lanes and/or higher transfer rate."

If this error occurs, moving your adapter to a true PCIe v3.0 x8 slot will
resolve the issue. For best performance, the device needs to be installed in a
PCIe v4.0 x8 or v3.0 x16 slot.


Throughput lower than expected
------------------------------
In FreeBSD 11.3, you may observe lower than expected throughput. This is due to
an underlying OS limitation in FreeBSD 11.3. Using FreeBSD 12.0 or newer should
resolve the issue.

If your Rx throughput is lower than expected in FreeBSD 11.3 or 12.1, you can
also adjust the iflib sysctl variable 'rx_budget.' We have seen performance
benefits by increasing that value to at least 85. For example:

# sysctl dev.ice.0.iflib.rx_budget=85


Fiber optics and auto-negotiation
---------------------------------
Modules based on 100GBASE-SR4, active optical cable (AOC), and active copper
cable (ACC) do not support auto-negotiation per the IEEE specification. To
obtain link with these modules, you must turn off auto-negotiation on the link
partner's switch ports.


Support
=======
For general information, go to the Intel support website at:
http://www.intel.com/support/

If an issue is identified with the released source code on a supported kernel
with a supported adapter, email the specific information related to the issue
to freebsd@intel.com


Copyright(c) 2019 - 2021 Intel Corporation.


Trademarks
==========
Intel is a trademark or registered trademark of Intel Corporation or its
subsidiaries in the United States and/or other countries.

* Other names and brands may be claimed as the property of others.


