LTE & NR Coexistance #422
base: improve_5g_ims
These shouldn't be hardcoded, but rather use the IP addresses declared in .env
Rework the MySQL container:
- Change mysql_init.sh to be bind mounted: This allows dynamic changes to the script without having to rebuild the container image.
- Change the way the container data is first set up: The current process depends on the container being started the first time with an empty docker volume, since the docker volume will then implement a "copy on first use" method to populate itself with data. This data is in the docker image from the initial build, as installing mysql-server also initializes the data directory. If for some reason the docker volume is not "empty", then the generic MySQL data directory from the image is never copied into it, resulting in an invalid data directory. This adds checks to ensure that the data directory is present and, if not, initializes one before trying to use it.
- Change the permission setting: The current usermod -d call doesn't guarantee that the data directory is owned by mysql:mysql, and therefore leaves open the possibility that the MySQL daemon cannot use the data directory. This occurs when migrating the volume from one machine to another. Adding the chown -R call ensures that the owner is set properly.
- Don't use mysql restart: Some edge cases exist where restart doesn't actually stop any running mysql instances. Use stop and start instead, and also add a kill call to ensure that all running mysql instances are fully killed before we make changes.
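The data-directory checks described in this commit message could look roughly like the sketch below. It is not the mysql_init.sh from this PR: the function names, the `DATADIR` default, and the use of the `mysql` system schema directory as the "already initialized" marker are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the data-directory checks; not the script from this PR.

# Assumed default location; the real script may use a different path.
DATADIR="${DATADIR:-/var/lib/mysql}"

# Return 0 when the directory does not look like an initialized MySQL data
# directory. Using the "mysql" system schema directory as the marker is an
# assumption of this sketch.
datadir_needs_init() {
    local dir="$1"
    [ ! -d "$dir/mysql" ]
}

# Initialize the data directory when needed, then make sure mysql:mysql owns it.
prepare_datadir() {
    local dir="$1"
    if datadir_needs_init "$dir"; then
        mkdir -p "$dir"
        # mysqld refuses to initialize a non-empty directory, so leftover
        # files in the volume would have to be dealt with before this call.
        mysqld --initialize-insecure --user=mysql --datadir="$dir"
    fi
    # Fix ownership, for example after the volume was migrated between machines.
    chown -R mysql:mysql "$dir"
}
```

A startup script would call `prepare_datadir "$DATADIR"` once before starting the daemon; only the check itself is cheap enough to run on every start.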
- Don't use pkill or kill with database processes. These could lead to database corruption.
- Make sure to gracefully shut down the database daemon/services when the script receives a terminate or interrupt signal.
- Ensure that mysqld_safe doesn't take over the PID so that interrupt or terminate signals reach the script.
- Change the image to use ENTRYPOINT to ensure that signals reach it.
- Add health check: Docker will ping the mysql database service every 30 seconds to ensure it's still up.
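One way to get the graceful-shutdown behaviour listed above is a `trap` in the init script combined with running the daemon as a background child. The sketch below only illustrates that pattern and is not the script from this PR: the function name and the `START_CMD` and `STOP_CMD` defaults are assumptions, and a real `mysqladmin shutdown` call would also need credentials.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of signal-aware startup; not the script from this PR.

# Commands used to start the daemon and to stop it gracefully.
# Both defaults are assumptions; override them to match the real setup.
START_CMD="${START_CMD:-mysqld --user=mysql}"
STOP_CMD="${STOP_CMD:-mysqladmin shutdown}"

run_with_graceful_shutdown() {
    # Install the handler first so an early signal is not lost.
    # On SIGTERM or SIGINT, ask the daemon to shut down cleanly.
    trap '$STOP_CMD' TERM INT

    # Run the daemon in the background so this shell keeps receiving signals.
    $START_CMD &
    local daemon_pid=$!

    # "kill -0" sends no signal; it only checks that the process still exists.
    # wait returns early when a trapped signal arrives, so keep waiting
    # until the daemon has really exited.
    while kill -0 "$daemon_pid" 2>/dev/null; do
        wait "$daemon_pid" || true
    done
}
```

For the signals to reach the script at all, it has to be the container's main process, which is what the exec form of `ENTRYPOINT` provides.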
Currently the P-CSCF can handle LTE-LTE and NR-NR communications. Cross-technology communications are still a work in progress.
Using the PANI header, we can determine what interface this UE should register via.
TODO:
- Implement detection for additional network types (ex. for IWLAN)
- Add some sort of database store for network access type, to be used in the mt config, as P-CSCF cannot query the network type from the device before starting a call
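As an illustration of the PANI-based selection, the sketch below maps the access-type at the start of a P-Access-Network-Info value to the interface name used in this PR (N5 for NR, Rx for LTE). It is a shell stand-in for logic that actually lives in the Kamailio route config, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: map a P-Access-Network-Info value to a policy interface.

# Prints "N5" for NR access, "Rx" for E-UTRAN (LTE) access, and "unknown"
# for anything else (for example IEEE-802.11 for IWLAN, which is still a TODO).
tech_from_pani() {
    local pani="$1"
    # The access-type is the token before the first ';'.
    local access_type="${pani%%;*}"
    # Drop any leading whitespace left over from the header line.
    access_type="${access_type#"${access_type%%[![:space:]]*}"}"
    case "$access_type" in
        3GPP-NR-*)      echo "N5" ;;
        3GPP-E-UTRAN-*) echo "Rx" ;;
        *)              echo "unknown" ;;
    esac
}
```

For example, `tech_from_pani '3GPP-NR-FDD;utran-cell-id-3gpp=...'` selects N5, while a value starting with `3GPP-E-UTRAN-FDD` or `3GPP-E-UTRAN-TDD` selects Rx.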
MT is currently broken due to the lack of a PANI header. This needs to be reconfigured to read out of a database that has updated values that are written during UE registration.
Precursor to moving the N5 code to its own configs
This allows us to have entirely different routes to select from depending on the technology, versus having to do switching in the middle of the routes.
Required after the route config file split
Thank you very much for your contribution. I will try to review this MR as soon as possible. One thing that I believe prevents co-existence from working: there should be two instances of SMF and UPF (one handling 4G and another 5G); I'm referring to deploy-all.yaml.
@@ -45,7 +45,7 @@ elif [[ "$COMPONENT_NAME" =~ ^(pcscf-[[:digit:]]+$) ]]; then
 /mnt/pcscf/pcscf_init.sh && \
 mkdir -p /var/run/kamailio_pcscf && \
 rm -f /kamailio_pcscf.pid && \
-kamailio -f /etc/kamailio_pcscf/kamailio_pcscf.cfg -P /kamailio_pcscf.pid -DD -E -e
+kamailio -M 16 -m 128 -f /etc/kamailio_pcscf/kamailio_pcscf.cfg -P /kamailio_pcscf.pid -DD -E -e
Can you let me know why this is needed?
Not entirely sure what the root issue is, but I was seeing weird/random memory freeing issues when including all 5 route configs (register, mo_N5, mo_Rx, mt_N5, mt_Rx). By excluding any one of those configs, the errors would disappear, and my research seemed to indicate that it was due to the process running out of memory. Bumping up the default memory to 16M/128M like this resolved the issues.
@@ -75,11 +75,11 @@ server_header="Server: TelcoSuite Proxy-CSCF"
 log_facility=LOG_LOCAL0

 fork=yes
-children=4
+children=16
Can you let me know why this is increased?
I was seeing some weird errors with TCP and HTTP connections after splitting the configs. Bumping this (and the TCP processes) up seemed to resolve it, though I want to see whether this was related to the memory issues noted above or not.
From my experience, increasing this value increases the memory requirement; that's the reason I kept it at 4 (compared to the sample configuration).
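To make the memory concern concrete: Kamailio's `-m` option sets the shared memory size in MB, allocated once, while `-M` sets the private (pkg) memory size in MB for each process. A rough upper bound on reserved memory therefore grows with the number of processes. The helper below is a hypothetical back-of-the-envelope sketch; the process count is passed in rather than derived from the config, because the real count also depends on listeners, timers, and modules.

```shell
#!/usr/bin/env bash
# Rough upper bound, in MB, on the memory Kamailio may reserve:
# shared memory (-m) is allocated once, private memory (-M) once per process.
# The process count is an input; it is not derived from the config here.
kamailio_mem_upper_bound_mb() {
    local shm_mb="$1" pkg_mb="$2" processes="$3"
    echo $(( shm_mb + pkg_mb * processes ))
}

# Example with -m 128 -M 16 and an assumed total of 16 + 32 = 48 processes.
kamailio_mem_upper_bound_mb 128 16 48
```

This is only a ceiling on what is reserved; resident memory is normally lower because untouched pages are not backed by physical RAM.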

 #!ifndef TCP_PROCESSES
 # Number of TCP Processes
-#!define TCP_PROCESSES 16
+#!define TCP_PROCESSES 32
same as above
see above
@@ -96,7 +96,7 @@ enable_tls=yes
 #!endif
 #!ifndef TCP_PROCESSES
 # Number of TCP Processes
-#!define TCP_PROCESSES 3
+#!define TCP_PROCESSES 12
same as above
see above
pcscf/kamailio_pcscf.cfg
Outdated
@@ -226,6 +226,7 @@ loadmodule "websocket.so"
 loadmodule "cdp"
 loadmodule "cdp_avp"
 loadmodule "ims_qos"
+loadmodule "ims_diameter_server"
Please remove this if it's not used.
Ah, sorry I thought I removed this. This was in an attempt to read the MSISDN from the Diameter Sh interface, which would have been my preferred approach vs reading from S-CSCF. I would still like to get this working, but I can remove it for now.
#!endif

# Tables to store users subscription tech (Rx or N5)
modparam("htable", "htable", "sub_tech=>size=4096;autoexpire=UE_SUBSCRIPTION_EXPIRES;")
I believe there is no UE_SUBSCRIPTION_EXPIRES defined in this branch (it's in the other branch improve_xxx).
It should be in this branch, since I'm based on the improve_5g_ims branch which has it: c3c907f

# Try retrieving the IMPI from S-CSCF using the MSISDN
$var(msisdn_sub_id) = $ru;
route(GET_IMPI_FROM_SCSCF);
Not much of a fan of having to contact S-CSCF to fetch the IMPI. Instead, I would suggest parsing all the IMPUs present in P-Associated-URI at registration time and associating all of them with either NR or LTE.
It wasn't my first approach either, I didn't want to have external connections, but it's the most reliable method I've found. At least in my research, it didn't seem that P-Associated-URI is always present, but perhaps I'm mistaken there? Maybe we could add a check against P-Associated-URI first and then fall back to S-CSCF if P-Associated-URI isn't present?
I didn't want to have external connections, but it's the most reliable method I've found. At least in my research, it didn't seem that P-Associated-URI is always present, but perhaps I'm mistaken there?
That SIP header is present most of the time (basically, the first value in this header sets the calling number the UE needs to use). If it's not present, then we can always use the IMPI (IMSI-based, used for registration).
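The parsing suggested in this thread amounts to splitting the P-Associated-URI value into its individual URIs, each of which could then be stored alongside the registration tech. The sketch below shows only the splitting step in shell; the function name is hypothetical, and the real logic would live in the Kamailio register route.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: list the URIs carried in a P-Associated-URI header value.

# Prints one URI per line, without the surrounding angle brackets.
impus_from_p_associated_uri() {
    local value="$1"
    # Each entry looks like <sip:user@domain> or <tel:+number>, comma separated.
    printf '%s\n' "$value" | grep -o '<[^>]*>' | tr -d '<>'
}
```

Each printed URI would become a key pointing at the same tech value (Rx or N5), so MO and MT lookups can use whichever identity the device presents.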
pcscf/route/register.cfg
Outdated
$var(imsi) = $tU;
#xlog("L_INFO", "IMSI: $var(imsi)\n");

#!ifdef WITH_RX
Please remove commented-out code for features which may be added in the future.
Sorry, missed this (attempt at Diameter Sh)
pcscf/route/register.cfg
Outdated
#!endif

#!ifdef WITH_N5
if (!$var(msisdn)) {
I didn't understand why retrieving the MSISDN is needed here.
This was an attempt to fetch the MSISDN to use for checking the subscription in MO and MT routes, since the devices do not use the IMPI then, but this isn't functional on Open5GS due to missing APIs.
pcscf/route/register.cfg
Outdated
}

#!ifdef WITH_RX
#event_route[ims_diameter_server:sh-User-Data-Answer] {
please remove this as well
I've not found any issues with running both LTE and NR with a single SMF/UPF pair, but maybe I'm missing something. Everything I've tested on LTE seems to work fine, but if you know what I should be on the lookout for, I can test. Additionally, see my replies to the comments above.
Thank you for addressing the comments. Is it okay if I delay merging this MR until calling in NR is verified?
That's because SMF is deployed in 4G mode when deployed using deploy-all.yaml.
Sure, I'm fine with that, but I personally can't do that given my continued QoS bearer issues (still not sure why it's not working even on a Threadripper 7960X with 256GB RAM). I'll upload a new chain today with some of the cleanup you requested around the commented-out code, and I'll test identity retrieval from the P-Associated-URI header like you suggested as well.
Are you sure? Looking at smf_init.sh, it looks like the 4G SMF config is only loaded if DEPLOY_MODE is set to 4G: https://github.com/herlesupreeth/docker_open5gs/blob/improve_5g_ims/smf/smf_init.sh#L40 When using deploy-all, it's set to ALL, though, not 4G: https://github.com/herlesupreeth/docker_open5gs/blob/improve_5g_ims/deploy-all.yaml#L278 Maybe I'm missing something, though?
@herlesupreeth cleaned up the unused, commented-out code (see above commit). Also, I added code to print the P-Associated-URI header element from the header if it's present in MO and MT, and I'm not seeing it in any of the headers from my devices. Any ideas on why it's not showing?
This completes the changes required for dynamic N5 and Rx route selection. This adds the following:
- A table to store subscribers and their registration tech
- A table to store contact subscribers for contact aors
- An HTTP endpoint in S-CSCF to query for an IMPI given a public identity (ex. MSISDN)
- Proper retrieval of the IMPI in MO and MT routes
- Routing based on the registration tech retrieved from the table given an IMPI
@herlesupreeth I've updated the branch with some additional fixes and I've now been able to verify that calls work back and forth from an NR device to LTE (I can upload videos if you'd like verification). Not exactly sure why but my srsRAN setup suddenly stopped having the QoS PDU modification issues it was having previously. I've also included the srsRAN update commit from the old VoNR branch, and the SGsAP commit from the main branch to bring this branch up to date.
Regarding the P-Associated-URI parsing, I still do not see that header element passed from my UEs in any of the packets received by the P-CSCF. I don't really like the S-CSCF connection either, so I'm going to look at better ways of doing this (perhaps saving it in the MySQL database or something), but as this chain stands right now, voice calls, video calls, and SMS all work between LTE and NR devices.
I'm working on a new custom environment actually that may necessitate a better "tech storage" mechanism anyway: deploying an AMF, SMF, and UPF in a second location, with their own gNB, that all connect back to this main core. This will likely need a second P-CSCF at the second location as well, but still working on that.
Let me know if you'd like to see anything else or address any additional concerns with this series. Thanks!
Hey!! Thanks a lot for addressing all the comments. Regarding P-Associated-URI, it's sent only in the 200 OK for SIP REGISTER. Anyway, I will take a look at the changes done recently in this PR.
Thanks! I’ll take a look at the 200 REGISTER message and see what I can parse there.
One more note: I’ve been playing with the processes/children again, and it looks like that change may not be needed, so I may revert that (or remove from the series). But in doing so, I’ve found something strange: the P-CSCF opens A LOT of connections to the MySQL database. Like an insane number. With the old 4 children and 16 TCP processes, it opens around 100 connections to the MySQL database. With my current changes, it’s around 350. For comparison, S-CSCF opens one connection and SMSC opens two.
Any ideas on why this might be happening? It definitely doesn’t seem normal, since really it shouldn’t need that many at all, maybe 4 connections or so for usrloc, pcscf_usrloc and ims_dialog, but maybe I’m missing something.
Sadly, no. I haven't observed that closely.
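To compare connection counts like the ones reported above, one option is to group the MySQL process list by client host. The helper below is a hypothetical sketch that does the grouping on plain `host:port` lines, so it can be fed from any source; the query shown in the comment assumes the `mysql` client is available and that credentials are configured.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: count MySQL connections per client host.

# Reads "host:port" lines on stdin (one per connection) and prints
# "<count> <host>" lines sorted by count, highest first.
count_connections_per_host() {
    sed 's/:[0-9]*$//' | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Possible usage against a live server (credentials and host are assumptions):
#   mysql -N -B -e 'SELECT host FROM information_schema.processlist' \
#       | count_connections_per_host
```

Running this before and after changing `children` and `TCP_PROCESSES` would show whether the connection count scales with the number of worker processes.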
This adds support for IMS routing between devices on LTE and NR.
Calls and SMS should work fine. I have tested SMS but cannot test calls due to still having QoS issues on NR, but according to the logs, it should work.
Additionally it adds some cleanup for the mysql database container. This makes it far more robust.