William Brown
2021-04-15 05:02:41 UTC
Hi everyone,
At the moment I am helping a student with their higher education thesis. As part of this, we need to understand realistic workloads from 389-ds servers in production.
To assist, I would like to ask if anyone is able and willing to volunteer to submit sanitised content of their access logs to us for this. We may also be able to use these for 389-ds benchmarking and simulation in the future.
An example of the sanitised log output is below. All DNs are replaced with randomly generated UUIDs so that no data can be reversed from the content of the access log. The script uses a combination of filter and basedn uniqueness, along with nentries, to build a virtual tree, and then substitutes in extra data as required. All rtimes are relative times of when the event occurred relative to the start of the log, so we also do not see information about the time of accesses.
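As a rough sketch of the idea (this is not the actual access_pattern_extract.py code, just an illustration with made-up names), the DN-to-UUID mapping and relative-time handling look something like this:

import uuid
from datetime import datetime

# Hypothetical illustration only: the same DN always maps to the same
# randomly generated UUID, so access patterns are preserved while the
# original values cannot be recovered from the output.
dn_to_uuid = {}

def anonymise_dn(dn):
    if dn not in dn_to_uuid:
        dn_to_uuid[dn] = str(uuid.uuid4())
    return dn_to_uuid[dn]

def relative_time(event_time, log_start_time):
    # Only the offset from the first event in the log is kept,
    # not the absolute wall-clock time of the access.
    return str(event_time - log_start_time)

# Example:
#   start = datetime(2021, 4, 15, 5, 0, 0)
#   relative_time(datetime(2021, 4, 15, 5, 0, 36, 219942), start)
#   -> '0:00:36.219942'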
This data will likely be used in a public setting, so assume that it will be released if provided. Of course, I encourage you to review the content of the sanitisation script and the sanitised output so that you are comfortable running this tool. It's worth noting the tool will likely use a lot of RAM, so you should NOT run it on your production server - rather, copy the production access log to another machine and process it there.
1 hour to 24 hours of processed output from a production server would help a lot!
Please send the output as a response to this mail, or directly to me ( wbrown at suse dot de ).
Thanks,
[
  {
    "etime": "0.005077600",
    "ids": [
      "9b207c6e-f7a2-4cd8-984a-415ad5e0960f"
    ],
    "rtime": "0:00:36.219942",
    "type": "add"
  },
  {
    "etime": "0.000433300",
    "ids": [
      "9b207c6e-f7a2-4cd8-984a-415ad5e0960f"
    ],
    "rtime": "0:00:36.225207",
    "type": "bind"
  },
  {
    "etime": "0.000893100",
    "ids": [
      "9b207c6e-f7a2-4cd8-984a-415ad5e0960f",
      "eb2139a1-a0f3-41cf-bdbe-d213a75c6bb7"
    ],
    "rtime": "0:00:40.165807",
    "type": "srch"
  }
]
USAGE:
python3 access_pattern_extract.py /path/to/log/access /path/to/output.json
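If you want to sanity-check the output before sending it, it is plain JSON and can be loaded directly. For example (assuming the output file is output.json), a quick tally of the operation types in the capture:

import json
from collections import Counter

# Load the sanitised events and count how many of each operation type
# (add, bind, srch, ...) appear in the capture.
with open("output.json") as f:
    events = json.load(f)

print(Counter(event["type"] for event in events))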